NAVAL RESEARCH LOGISTICS QUARTERLY
DECEMBER 1979, VOL. 26, NO. 4
OFFICE OF NAVAL RESEARCH    NAVSO P-1278

EDITORIAL BOARD

Marvin Denicoff, Office of Naval Research, Chairman

Ex Officio Members: Murray A. Geisler, Logistics Management Institute; W. H. Marlow, The George Washington University; Bruce J. McDonald, Office of Naval Research Tokyo; Thomas C. Varley, Office of Naval Research, Program Director; Seymour M. Selig, Office of Naval Research, Managing Editor

MANAGING EDITOR

Seymour M. Selig, Office of Naval Research, Arlington, Virginia 22217

ASSOCIATE EDITORS

Frank M. Bass, Purdue University; Jack Borsting, Naval Postgraduate School; Leon Cooper, Southern Methodist University; Eric Denardo, Yale University; Marco Fiorello, Logistics Management Institute; Saul I. Gass, University of Maryland; Neal D. Glassman, Office of Naval Research; Paul Gray, University of Southern California; Carl M. Harris, Mathematica, Inc.; Arnoldo Hax, Massachusetts Institute of Technology; Alan J. Hoffman, IBM Corporation; Uday S. Karmarkar, University of Chicago; Paul R. Kleindorfer, University of Pennsylvania; Darwin Klingman, University of Texas, Austin; Kenneth O. Kortanek, Carnegie-Mellon University; Charles Kriebel, Carnegie-Mellon University; Jack Laderman, Bronx, New York; Gerald J. Lieberman, Stanford University; Clifford Marshall, Polytechnic Institute of New York; John A. Muckstadt, Cornell University; William P. Pierskalla, Northwestern University; Thomas L. Saaty, University of Pennsylvania; Henry Solomon, The George Washington University; Wlodzimierz Szwarc, University of Wisconsin, Milwaukee; James G. Taylor, Naval Postgraduate School; Harvey M. Wagner, The University of North Carolina; John W. Wingate, Naval Surface Weapons Center, White Oak; Shelemyahu Zacks, Case Western Reserve University

The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations. Information for Contributors is indicated on the inside back cover.

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June, September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Subscription Price: $11.15 a year in the U.S. and Canada, $13.95 elsewhere. Cost of individual issues may be obtained from the Superintendent of Documents.

The views and opinions expressed in this Journal are those of the authors and not necessarily those of the Office of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations P-35 (Revised 1-74).

OPTIMAL SET PARTITIONING, MATCHINGS AND LAGRANGIAN DUALITY*

George L. Nemhauser
School of Operations Research and Industrial Engineering
Cornell University, Ithaca, New York

Glenn M. Weber
Christopher Newport College, Newport News, Virginia

ABSTRACT

We formulate the set partitioning problem as a matching problem with simple side constraints. As a result we obtain a Lagrangian relaxation of the set partitioning problem in which the primal problem is a matching problem. To solve the Lagrangian dual we must solve a sequence of matching problems, each with different edge-weights. We use the cyclic coordinate method to iterate the multipliers, which implies that successive matching problems differ in only two edge-weights.
This enables us to use sensitivity analysis to modify one optimal matching to obtain the next one. We give theoretical and empirical comparisons of these dual bounds with the conventional linear programming ones.

1. INTRODUCTION

We consider the set partitioning problem

(SP)    max Σ_{j=1}^n d_j y_j
        subject to Σ_{j=1}^n a_{ij} y_j = 1,  i = 1, ..., m
        y_j ∈ {0,1},  j = 1, ..., n,

where d_j is an arbitrary real number for all j and a_{ij} ∈ {0,1} for all i and j.

Balas and Padberg [1] give a survey of applications and methods for solving the set partitioning problem. Except for algorithms developed for small problems, most algorithms for solving the set partitioning problem use linear programming relaxations. However, for large problems, because of degeneracy, the linear programs obtained by replacing the binary restriction on each y_j in (SP) by 0 ≤ y_j ≤ 1 often are difficult to solve (Marsten [12]). As a result, the typically large (and sparse) set partitioning problem sometimes cannot be solved.

*This work has been supported by National Science Foundation Grant ENG75-00568 to Cornell University.

We consider a different relaxation that uses matchings on graphs and Lagrangian duality. This is accomplished by reformulating the set partitioning problem as a (weighted, perfect) matching problem, a version of (SP) in which Σ_{i=1}^m a_{ij} = 2 for j = 1, ..., n, with simple side constraints. The side constraints are incorporated into the objective function in a Lagrangian fashion, resulting in the primal Lagrangian matching relaxation. The matching problem is one of the few tractable combinatorial problems, and thus is an attractive relaxation for large set partitioning problems.

2. LAGRANGIAN MATCHING RELAXATION

In (SP) let a_j = (a_{1j}, ..., a_{mj}) and suppose that for all j, Σ_{i=1}^m a_{ij} = 2K_j for some integer K_j. This places no limitation on the generality of (SP), since if Σ_{i=1}^m a_{ij} = 2K_j − 1 then a new constraint y_j ≤ 1 can be added to the problem.
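None of the following comes from the paper itself; it is a minimal Python sketch (the function name and the tiny instance are ours) that makes (SP) concrete by brute-force enumeration, which is viable only for very small n:

```python
from itertools import product

def solve_sp(a, d):
    """Brute-force the set partitioning problem (SP):
    maximize sum d[j]*y[j] subject to sum_j a[i][j]*y[j] == 1 for every
    row i, with y binary.  a is an m x n 0/1 matrix given row-wise."""
    m, n = len(a), len(a[0])
    best_val, best_y = None, None
    for y in product((0, 1), repeat=n):
        # every row must be covered by exactly one chosen column
        if all(sum(a[i][j] * y[j] for j in range(n)) == 1 for i in range(m)):
            val = sum(d[j] * y[j] for j in range(n))
            if best_val is None or val > best_val:
                best_val, best_y = val, y
    return best_val, best_y

# tiny instance: columns cover rows {1,2}, {3,4}, {1,2,3,4}, {2,3}
a = [[1, 0, 1, 0],
     [1, 0, 1, 1],
     [0, 1, 1, 1],
     [0, 1, 1, 0]]
d = [1, 1, 3, 1]
```

Here the only feasible partitions are {column 3} (value 3) and {columns 1, 2} (value 2), so the enumeration returns the single-column partition.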
We replace a_j by the set of columns {a_j^k}_{k=1}^{K_j}, where Σ_{k=1}^{K_j} a_j^k = a_j, Σ_{i=1}^m a_{ij}^k = 2 for k = 1, ..., K_j, and a_{ij}^k ∈ {0,1} for all i and k. One can form these columns in such a way that both nonzero components of a_j^k precede those of a_j^{k+1}, for all k. Column a_j^k is given an objective function coefficient of c_j = d_j / K_j and is associated with the variable x_{jk} ∈ {0,1}. The {x_{jk}} are required to satisfy x_{jk} = x_{j,k+1}, k = 1, ..., K_j − 1, for all j. We thus obtain a problem equivalent to (SP) given by

(MS)    max Σ_{j=1}^n Σ_{k=1}^{K_j} c_j x_{jk}
        subject to Σ_{j=1}^n Σ_{k=1}^{K_j} a_{ij}^k x_{jk} = 1,  i = 1, ..., m
        x_{jk} − x_{j,k+1} = 0,  j = 1, ..., n and k = 1, ..., K_j − 1
        x_{jk} ∈ {0,1}, all j and k,

which is a matching problem with side constraints x_{jk} = x_{j,k+1}. Using matrix notation, this problem can be written as

        max cx
        Mx = 1
        Sx = 0
        x binary,

where M and S are the coefficient matrices of the matching and side constraints, respectively. A solution to (MS) yields a solution for (SP), given in

PROPOSITION 1: If {x*_{jk}} is an optimal solution of (MS), then y*_j = x*_{j1}, j = 1, ..., n, is an optimal solution to (SP).

Let G(λ, x) = (c − λS)x, where the domain of x is {x | Mx = 1 and x binary} and λ is an unrestricted vector of Lagrange multipliers. The Lagrangian relaxation of (MS) relative to Sx = 0 is

(LR_λ)    F(λ) = max_x G(λ, x).

Without matrix notation, the Lagrangian relaxation can be written as

        max Σ_{j=1}^n Σ_{k=1}^{K_j} (c_j − (λ_{jk} − λ_{j,k−1})) x_{jk}
        subject to Σ_{j=1}^n Σ_{k=1}^{K_j} a_{ij}^k x_{jk} = 1,  i = 1, ..., m
        x_{jk} ∈ {0,1}, all j and k,

where, for j = 1, ..., n, λ_{j0} and λ_{j,K_j} are defined to be zero. (LR_λ) is a (weighted, perfect) matching problem.

Relaxations play a very important role in integer programming algorithms. To be worthwhile, the relaxed problem should be easier to solve than the original one and should also yield a tight bound on the original problem solution. Lagrangian relaxations often fulfill both of these criteria.
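The construction of (MS) and the bound F(λ) ≥ v(MS) that any Lagrangian relaxation provides can be illustrated on a toy instance. The sketch below is ours, not the paper's code: one (SP) column with four ones is split into two matching columns (each with coefficient c_j = d_j / K_j), and both v(MS) and F(λ) are brute-forced.

```python
from itertools import product

# Toy (MS): 4 rows; (SP) columns a1 = (1,1,1,1) with d1 = 3 (K1 = 2)
# and a2 = (1,1,0,0), a3 = (0,0,1,1) with d2 = d3 = 1 (K = 1).
# Splitting a1 pairs consecutive ones: a1^1 = (1,1,0,0), a1^2 = (0,0,1,1),
# each receiving objective coefficient c1 = d1 / K1 = 1.5.
edges = [((0, 1), 1.5),   # a1^1  (tied to a1^2 by the side row x[0] - x[1])
         ((2, 3), 1.5),   # a1^2
         ((0, 1), 1.0),   # a2
         ((2, 3), 1.0)]   # a3

def matchings():
    """Yield all binary x with Mx = 1: every row met by exactly one column."""
    for x in product((0, 1), repeat=len(edges)):
        hits = [0, 0, 0, 0]
        for xe, ((u, v), _) in zip(x, edges):
            if xe:
                hits[u] += 1
                hits[v] += 1
        if all(h == 1 for h in hits):
            yield x

def weight(x):
    return sum(xe * w for xe, (_, w) in zip(x, edges))

def F(lam):
    """F(lambda) = max over matchings of G(lambda, x) = (c - lambda*S)x."""
    return max(weight(x) - lam * (x[0] - x[1]) for x in matchings())

# v(MS): best matching that also satisfies the side constraint x[0] = x[1]
v_ms = max(weight(x) for x in matchings() if x[0] == x[1])
```

On this instance v(MS) = 3 (take the whole split column), and F(λ) ≥ 3 for every multiplier λ, with F(0) attaining the bound.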
Since one of the criteria of a good relaxation is the tightness of its bound, the best choice for λ in (LR_λ) is the one that optimizes the Lagrangian dual

(LD)    min_λ F(λ),

where λ = (λ_1, ..., λ_n) and λ_j = (λ_{j1}, ..., λ_{j,K_j−1}), for j = 1, ..., n.

Let v(P) represent the optimal objective function value of any problem (P), and let (SPLP) represent the linear programming relaxation of (SP). Proposition 2 (see Geoffrion [7]) summarizes the relationships between (SP), (SPLP), (MS), (LR_λ) and (LD).

PROPOSITION 2: (a) v(SP) = v(MS) ≤ v(SPLP), (b) for all λ, v(MS) ≤ v(LR_λ) (= F(λ)), (c) if for a given λ a vector x is optimal in (LR_λ) and Sx = 0, then x is an optimal solution of (MS), and (d) v(LD) ≤ v(SPLP).

Note that the Lagrangian relaxation using the λ found in (LD) is at least as tight as the linear programming relaxation; this is a consequence of the fact that the matrix M is not totally unimodular. Typically, v(SP) < v(LD) < v(SPLP).

3. OPTIMIZING THE LAGRANGIAN DUAL

Many methods (surveyed in Fisher, Northup and Shapiro [6] and Bazaraa and Goode [2]) have been proposed for solving Lagrangian duals. By far the most widely used is the subgradient optimization method described in Held and Karp [8] and Held, Wolfe and Crowder [9]. Compared to other methods, very little "overhead" is needed and, most importantly, it has proven to be very effective computationally. In subgradient optimization, a sequence {λ^t} of multiplier vectors is generated iteratively, using at each iteration the solution that yields F(λ^t). Many components of each λ^t change from iteration to iteration, and in the context of solving (LD), new optimal matchings must be found "from scratch." Although solving a large matching problem is much easier than solving a large linear programming problem, it still can be time consuming.
The (weighted, perfect) matching problem

        max cx
        Mx = 1
        x binary,

where each column of M contains exactly two nonzero entries, both equal to one, can be interpreted graphically by letting M be the node-edge incidence matrix of a graph in which each row of M represents a node and each column represents an edge; edge k meets node i if and only if m_{ik} = 1, and c_k is the weight assigned to edge k. The problem then is to choose a set of edges, called a feasible matching, so that each node meets exactly one of the edges selected, in such a way that the sum of the weights on the edges chosen is as large as possible.

Edmonds [3,4,5] developed an efficient (polynomially bounded) primal-dual algorithm for solving the matching problem, and Weber [14] showed how sensitivity analysis can be performed on optimal matchings to get the new optimal solution from the original optimal solution when the weight on an edge is changed. Except for some very simple special cases, the techniques involve modifying the graph by attaching additional nodes and edges near the edge whose edge-weight is to be altered. Edmonds' algorithm is re-entered with all the needed properties, including complementary slackness, being maintained. The final primal and dual solutions for the modified problem are then "translated back" to yield the optimal matching for the single altered edge-weight problem, in such a way that, again, all the needed properties are maintained (and thus the process can be repeated if other edge-weights are altered). Reoptimizing with these techniques when a single edge-weight is altered is on the order of N times more efficient than using Edmonds' algorithm "from scratch," where N equals the number of nodes in the graph.
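Edmonds' blossom algorithm is well beyond a short sketch, but the matching problem itself is easy to state in code. The following brute force (our own, exponential, for tiny graphs only) enumerates edge subsets directly:

```python
from itertools import combinations

def max_weight_perfect_matching(n_nodes, edges):
    """Enumerate edge subsets of size n_nodes/2 and keep the best one in
    which every node meets exactly one chosen edge.  edges: (u, v, weight).
    (Edmonds' blossom algorithm solves this in polynomial time; this
    brute force is only for illustration.)"""
    best = (float("-inf"), None)
    for subset in combinations(edges, n_nodes // 2):
        covered = [node for (u, v, _) in subset for node in (u, v)]
        if len(set(covered)) == n_nodes:          # perfect matching
            weight = sum(w for (_, _, w) in subset)
            best = max(best, (weight, subset))
    return best

# 4-cycle: its two perfect matchings weigh 2 + 2 = 4 and 3 + 3 = 6
w, m = max_weight_perfect_matching(
    4, [(0, 1, 2), (1, 2, 3), (2, 3, 2), (0, 3, 3)])
```

The heavier matching, edges (1,2) and (0,3), is returned with weight 6.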
Because of the special structure present in the side constraints of (MS), in which each variable appears in at most two equations, we choose to attempt to optimize (LD) by using an improved version of the cyclic coordinate method of nonlinear programming. The structure of the S matrix results in each λ_{jk} appearing in the coefficient of at most two variables in G(λ, x). This allows the sensitivity analysis techniques to be used to improve significantly the usual cyclic coordinate method.

In this method, F(λ) is minimized cyclically in each of the coordinate directions. Thus, after initializing λ, we minimize F(λ) with respect to λ_{11}, ..., λ_{1,K_1−1}, ..., λ_{n1}, ..., λ_{n,K_n−1}, in that order, one at a time. This process, which involves Σ_{j=1}^n (K_j − 1) single-variable minimizations, is repeated until the objective function stops decreasing. Typically, each one-variable minimization is accomplished by one of the iterative or grid-type procedures used in unconstrained optimization algorithms. However, because of the special structure of the problem, Theorem 1 provides a direct formula for each minimization, thus avoiding the time-consuming "line searches." The proof of Theorem 1 appears in the Appendix.

THEOREM 1: Suppose x* is the current optimal matching vector, λ* is the current optimal Lagrange multiplier vector, λ̄ and λ̲ are identical to λ* except for the λ_{jk} component, with λ̄_{jk} = 1 + |c − λ*S|·1 and λ̲_{jk} = −1 − |c − λ*S|·1, and x̄ maximizes G(λ̄, x) and x̲ maximizes G(λ̲, x). An optimal λ**_{jk} that minimizes F(λ) with respect to λ_{jk}, with all other components of λ fixed at their values in λ*, depends on x*, x̄ or x̲, and λ* as follows:

CASE 1: If x*_{jk} = x*_{j,k+1} = 1, then λ**_{jk} = λ*_{jk}.

CASE 2: If x*_{jk} = x*_{j,k+1} = 0, then λ**_{jk} = λ*_{jk}.
CASE 3: If x*_{jk} = 1 and x*_{j,k+1} = 0, then

(a) if x̄_{jk} = 1 and x̄_{j,k+1} = 0, then λ**_{jk} = ∞ ((LD) is unbounded and (MS) is infeasible),
(b) if x̄_{jk} = 1 and x̄_{j,k+1} = 1, then λ**_{jk} = λ*_{jk} + [F(λ*) − F(λ̄)],
(c) if x̄_{jk} = 0 and x̄_{j,k+1} = 0, then λ**_{jk} = λ*_{jk} + [F(λ*) − F(λ̄)],
(d) if x̄_{jk} = 0 and x̄_{j,k+1} = 1, then λ**_{jk} = λ*_{jk} + (1/2)[(λ̄_{jk} − λ*_{jk}) − (F(λ̄) − F(λ*))].

CASE 4: If x*_{jk} = 0 and x*_{j,k+1} = 1, then

(a) if x̲_{jk} = 0 and x̲_{j,k+1} = 1, then λ**_{jk} = −∞ ((LD) is unbounded and (MS) is infeasible),
(b) if x̲_{jk} = 0 and x̲_{j,k+1} = 0, then λ**_{jk} = λ*_{jk} − [F(λ*) − F(λ̲)],
(c) if x̲_{jk} = 1 and x̲_{j,k+1} = 1, then λ**_{jk} = λ*_{jk} − [F(λ*) − F(λ̲)],
(d) if x̲_{jk} = 1 and x̲_{j,k+1} = 0, then λ**_{jk} = λ*_{jk} − (1/2)[(λ*_{jk} − λ̲_{jk}) − (F(λ̲) − F(λ*))].

Theorem 1 is fairly easy to implement. The only unknown quantities in the formulas for λ**_{jk} are F(λ̄) and F(λ̲), and depending on x*, at most one of these must be found. Computationally, the task of finding either one of these quantities is quite simple, since it is not necessary to solve a new matching problem "from scratch," but only to use sensitivity analysis techniques to reoptimize the matching with two edge-weights altered. The techniques are applied on those edges, one at a time. After computing λ**_{jk}, the two edge-weights are again altered using λ**_{jk}, and a new optimal matching is determined using the sensitivity analysis techniques.

At each step of the cyclic coordinate method a new vector λ is generated, differing from the previous λ in at most one component. Let λ^t represent the t-th such vector.

THEOREM 2: Assuming v(MS) exists, the sequence {F(λ^t)} converges.

PROOF: For all t, F(λ^{t+1}) ≤ F(λ^t) and, by Proposition 2, the sequence has a lower bound of v(MS). A bounded, nonincreasing sequence has a limit. □

Let F̄ denote the limit of {F(λ^t)}. Zangwill [15] and Luenberger [11] give mild restrictions, including F(λ) having continuous first partial derivatives and a unique minimum point along any coordinate direction, that guarantee global convergence of the cyclic coordinate method. Unfortunately, F(λ) violates these restrictions.
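As a generic illustration of the cyclic coordinate method (this sketch and its quadratic test function are ours, not the paper's), each coordinate is replaced in turn by its exact one-dimensional minimizer, playing the role that Theorem 1's closed-form update plays for F(λ):

```python
def cyclic_coordinate_descent(coord_min, x0, cycles=50):
    """Minimize a function by cycling through the coordinates, replacing
    x[i] with its exact one-dimensional minimizer while the other
    coordinates stay fixed (no line search needed)."""
    x = list(x0)
    for _ in range(cycles):
        for i in range(len(x)):
            x[i] = coord_min(i, x)
    return x

# illustrative F(x) = (x0 - 1)^2 + (x0 + x1)^2; its exact coordinate
# minimizers come from setting each partial derivative to zero:
def coord_min(i, x):
    return (1 - x[1]) / 2 if i == 0 else -x[0]

x = cyclic_coordinate_descent(coord_min, [0.0, 0.0])
```

For this smooth, strictly convex function the iterates converge to the global minimizer (1, −1); as the text notes, F(λ) itself violates the smoothness conditions that guarantee such convergence.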
It is not necessarily true that F̄ = v(LD). In fact, if F(λ) = F̄ for some λ generated by the method, then λ might not even be a local minimum, since it is only a relative minimum with respect to the coordinate directions.

The following is an example in which the sequence {F(λ^t)} generated by the cyclic coordinate method does not converge to v(LD). Let O_5 be the null matrix of order 5 and let Z be a fixed 5 × 5 zero-one block. Let the A matrix of the set partitioning version of (SP) be the 20 × 25 zero-one matrix, with exactly four ones per column, assembled from the blocks O_5 and Z. Let the objective function coefficients corresponding to the first 20 columns of A be 1 and those for the last 5 columns be 0. The solutions are

v(SP) = 2, where y_1 = y_18 = y_22 = y_23 = y_24 = 1 and y_j = 0 otherwise,
v(SPLP) = 5, where y_j = 1/4, j = 1, ..., 20, and y_j = 0 otherwise,
v(LD) = 2, where λ_{10,1} = −1/2, λ_{11,1} = 1/2, λ_{22,1} = 1, λ_{24,1} = −1, and λ_{jk} = 0 otherwise,
{F(λ^t)} → F̄ = 3 using the cyclic coordinate method (established empirically).

Thus, v(SP) = v(LD) < F̄ < v(SPLP). Notice that the bound achieved from solving the Lagrangian dual, for which the subgradient method is successful (established empirically), and the bound from the cyclic coordinate method are both superior to the one obtained by using linear programming. It should be pointed out that in the usual implementation of subgradient optimization, global convergence is not guaranteed.

In addition to avoiding "from scratch" solutions to large matching problems, another important reason for choosing the cyclic coordinate method instead of the subgradient method is that subgradient optimization lacks an important property of the easy-to-perform cyclic coordinate method. In using subgradient optimization, the sequence {F(λ^t)} is not monotone; it can take quite a few iterations until any progress is made in minimizing F(λ).
Since in a branch-and-bound method we are more interested in getting a close approximation of v(LD) in a short period of time than we are in solving it exactly, it seems reasonable to choose a method that begins showing progress in minimizing F(λ) immediately. Actual computational comparisons of the two methods are given in the next section.

4. COMPUTATIONAL RESULTS

The results of the computational experiments, performed only at the initial node of a branch-and-bound tree, are summarized in three tables. Fourteen problems of varying sizes were run using the cyclic coordinate method (Table 1) and the subgradient method (Table 2) for optimizing the Lagrangian dual (LD), and using linear programming (Table 3) for solving the continuous relaxation of (SP). Each problem contains exactly four ones per column, and in the tables each is labeled type S2, R1 or RR. S2 is the example given in Section 3. The other two types have constraint matrices consisting of a randomly generated portion containing n − m/4 columns and a set of m/4 "dummy" columns to insure feasibility. The i-th such "dummy" column contains ones in rows 4i−3, 4i−2, 4i−1 and 4i, and zeros elsewhere. The objective function coefficients are zero for these columns. Types R1 and RR differ in the objective function coefficients for the other columns. Problems of type R1 have all the coefficients equal to one, while problems of type RR have randomly generated integer coefficients with values between one and ten.

TABLE 1. Cyclic Coordinate Method*

Problem  m x n      Type  Initial  Final  Sx=0?  Cycle No. at  Iterations  Time (sec.)†
                          Value    Value         Termination               on IBM 370/168
1        20 x 20    R1    4.5      3      No     2             13          10
2        20 x 20    R1    5        3.98   No     4             30          15
3        20 x 25    S2    4        3‡     No     13            67          4.78
4        20 x 50    R1    5        5      No     5             40          15
5        20 x 50    R1    5        5      No     2             14          10
6        20 x 50    RR    40.5     35.00  No     5             32          30
7        40 x 40    R1    8.5      3.80   No     2             22          20
8        40 x 40    R1    9.5      8.25   No     1             16          20
9        40 x 100   R1    10       10     No     1             15          15
10       40 x 100   R1    10       10     No     1             17          20
11       40 x 100   RR    84.5     81.06  No     1             13          30
12       60 x 60    R1    13       9.70   No     2             29          60
13       60 x 150   R1    15       15     No     1             18          60
14       100 x 250  R1    25       25     No     1             5           60

*The program for the matching algorithm is given in [13].
†CPU time. Integer values indicate arbitrarily set CPU time limits.
‡Converging to within .00001 of 3.

TABLE 2. Subgradient Method

Problem  m x n      Type  Initial  Final   Best   Sx=0?  Iterations  Time (sec.)
                          Value    Value   Value                      on IBM 370/168
1        20 x 20    R1    4.5                     Yes    9           1.45
2        20 x 20    R1    5        .60     .57    No     77          15
3        20 x 25    S2    4        2       2      Yes    9           1.27
4        20 x 50    R1    5        5.43    5      No     29          15
5        20 x 50    R1    5        5.58    5      No     34
6        20 x 50    RR    40.5     32.94   32.81  No     69
7        40 x 40    R1    8.5                     Yes    10          5.98
8        40 x 40    R1    9.5                     Yes    12          7.86
9        40 x 100   R1    10       18.33   10     No     13          20
10       40 x 100   R1    10       21.24   10     No     12          20
11       40 x 100   RR    84.5     105.06  84.5   No     17          30
12       60 x 60    R1    13                      Yes    16          37.10
13       60 x 150   R1    15       34.15   15     No     11          60
14       100 x 250  R1    25       106.53  25     No     5           60

TABLE 3. Linear Programming†

Problem  m x n      Type  Final Value  Optimal?  Binary?  Iterations  Time (sec.)
                                                                      on IBM 370/168
1        20 x 20    R1                 Yes       Yes      19          .29
2        20 x 20    R1                 Yes       Yes      24          .34
3        20 x 25    S2    5            Yes       No       20          .31
4        20 x 50    R1    5            Yes       No       42          .58
5        20 x 50    R1    5            Yes       No       78          .80
6        20 x 50    RR    29.90        Yes       No       45          .73
7        40 x 40    R1                 Yes       Yes      54          .65
8        40 x 40    R1                 Yes       Yes      57          .74
9        40 x 100   R1    10           Yes       No       308         5.18
10       40 x 100   R1    10           Yes       No       350         6.18
11       40 x 100   RR    71.02        Yes       No       156         2.76
12       60 x 60    R1                 Yes       Yes      93          1.23
13       60 x 150   R1    15           Yes       No       516         13.44
14       100 x 250  R1    <25.44       No                 1209        60

†FORTRAN code given in Land and Powell [10].
In Table 1 a distinction is made between a cycle and an iteration. Each time F(λ) is minimized with respect to all of λ_{11}, ..., λ_{1,K_1−1}, ..., λ_{n1}, ..., λ_{n,K_n−1}, in that order, one at a time while the others are fixed, a cycle is completed. However, if when minimizing F(λ) with respect to, say, λ_{jk} we have that x_{jk} ≠ x_{j,k+1}, then this is considered an iteration. Thus, there are potentially as many as m/2 iterations per cycle. Loosely speaking, a cycle in the cyclic coordinate method corresponds to an iteration in the subgradient method.

Very seldom does an algorithm perform uniformly better than another on all problems, and the three methods tested are no exception to this rule. Each out-performs a competing method on at least one of the fourteen problems tested. However, certain general observations can be made. The cyclic coordinate method performs much slower than anticipated, although it does do better than the subgradient method on problem 11. Not surprisingly, linear programming was highly successful on all randomly generated problems except problem 14, the largest one, on which it was inferior to the other two methods. For this problem, linear programming failed to reach an optimum in one minute, while the other two methods were each able to provide useful information, since several matching problems could be solved in one minute.

We are not discouraged by the fact that the linear programming method out-performs the cyclic coordinate and subgradient methods on the majority of the test problems. The results of problem 3 indicate that there could be a class of problems in which, regardless of size, the cyclic coordinate and subgradient methods are superior to linear programming, and, perhaps more importantly, problem 14 indicates that for large problems the methods developed here could be a viable alternative to those algorithms that use linear programming.
ACKNOWLEDGMENT

We would like to thank Jack Edmonds for many helpful suggestions, particularly with regard to sensitivity analysis of the optimal matchings.

REFERENCES

[1] Balas, E. and M.W. Padberg, "Set Partitioning," pp. 205-258 in B. Roy, ed., Combinatorial Programming: Methods and Applications (D. Reidel Publishing Co., 1975).
[2] Bazaraa, M.S. and J.J. Goode, "A Survey of Various Tactics for Generating Lagrangian Multipliers in the Context of Lagrangian Duality," School of Industrial and Systems Engineering, Georgia Institute of Technology (1974).
[3] Edmonds, J., "Paths, Trees, and Flowers," Canadian Journal of Mathematics 17, 449-467 (1965).
[4] Edmonds, J., "Maximum Matching and a Polyhedron with 0,1-Vertices," Journal of Research of the National Bureau of Standards 69B, 125-130 (1965).
[5] Edmonds, J., "An Introduction to Matching," notes on lectures given at Ann Arbor, Michigan (1967).
[6] Fisher, M.L., W.D. Northup and J.F. Shapiro, "Using Duality to Solve Discrete Optimization Problems: Theory and Computational Experience," Mathematical Programming Study 3, 56-94 (1975).
[7] Geoffrion, A.M., "Lagrangian Relaxation for Integer Programming," Mathematical Programming Study 2, 82-114 (1974).
[8] Held, M. and R.M. Karp, "The Traveling-Salesman Problem and Minimum Spanning Trees: Part II," Mathematical Programming 1, 6-25 (1971).
[9] Held, M., P. Wolfe and H.P. Crowder, "Validation of Subgradient Optimization," Mathematical Programming 6, 62-88 (1974).
[10] Land, A.H. and S. Powell, Fortran Codes for Mathematical Programming: Linear, Quadratic and Discrete (John Wiley and Sons, 1973).
[11] Luenberger, D.G., Introduction to Linear and Nonlinear Programming (Addison-Wesley, 1973).
[12] Marsten, R.E., "An Algorithm for Large Set Partitioning Problems," Management Science 20, 774-787 (1974).
[13] Weber, G.M., "A Solution Technique for Binary Integer Programming Using Matchings on Graphs," Ph.D. Thesis, Cornell University (1978).
[14] Weber, G.M.
, "Sensitivity Analysis of Optimal Matchings," TR No. 427, School of Operations Research and Industrial Engineering, Cornell University (May 1979).
[15] Zangwill, W.I., Nonlinear Programming: A Unified Approach (Prentice-Hall, 1969).

APPENDIX

PROOF OF THEOREM 1: The proof of Case 2 parallels Case 1 and Case 4 parallels Case 3; thus only the proofs of Cases 1 and 3 are given. Throughout the proof we use the fact that a change in λ*_{jk} to λ_{jk} changes the objective function coefficients of x_{jk} and x_{j,k+1} in (LR_λ) by λ*_{jk} − λ_{jk} and λ_{jk} − λ*_{jk}, respectively.

CASE 1: By definition F(λ**) = max_x G(λ**, x) ≥ G(λ**, x*). Since x*_{jk} = x*_{j,k+1} = 1, for any λ identical to λ* except for the λ_{jk} component, G(λ, x*) = G(λ*, x*) = F(λ*). In particular, G(λ**, x*) = F(λ*), so that F(λ**) ≥ F(λ*). Now F(λ**) = min{F(λ): λ = λ* except for the λ_{jk} component} ≤ F(λ*). Hence F(λ**) = F(λ*) and λ** = λ*.

CASE 3: Let λ = λ* and consider continuously increasing λ_{jk} from λ*_{jk}, altering the matching x* only if G(λ, x) would increase by doing so. Let λ′_{jk} be the value of λ_{jk} at which x_{jk} first becomes 0, if such a value exists; otherwise set λ′_{jk} = ∞. Let λ″_{jk} be the value of λ_{jk} at which x_{j,k+1} first becomes 1, if such a value exists; otherwise set λ″_{jk} = ∞. Let a = min(λ′_{jk}, λ″_{jk}) and b = max(λ′_{jk}, λ″_{jk}). Note that λ̄_{jk} has been chosen sufficiently large so that if a or b is finite, it is smaller than λ̄_{jk}. Thus

x̄_{jk} = 0 if λ′_{jk} is finite, and x̄_{jk} = 1 otherwise; x̄_{j,k+1} = 1 if λ″_{jk} is finite, and x̄_{j,k+1} = 0 otherwise.

As long as x_{jk} = 1 there is a unit decrease in the objective function per unit increase in λ_{jk}, and when x_{j,k+1} = 1 there is a unit increase in the objective function per unit increase in λ_{jk}. Thus

(1)    dF(λ)/dλ_{jk} = −1 if λ*_{jk} ≤ λ_{jk} < a;  0 if a < λ_{jk} < b;  1 if b < λ_{jk} ≤ λ̄_{jk}.

In Case 3(a) we have a = ∞, so that F(λ) decreases monotonically with λ_{jk} and the dual is unbounded (λ**_{jk} = ∞).

In Cases 3(b) and 3(c), a is finite and b = ∞. We can set λ**_{jk} = a.
From (1), F(λ̄) − F(λ*) = −(a − λ*_{jk}), so that λ**_{jk} = a = λ*_{jk} + F(λ*) − F(λ̄).

In Case 3(d), a and b are finite. We can set λ**_{jk} = (a + b)/2. From (1), F(λ̄) − F(λ*) = −(a − λ*_{jk}) + (λ̄_{jk} − b), so that λ**_{jk} = (a + b)/2 = (λ*_{jk} + λ̄_{jk} + F(λ*) − F(λ̄))/2. □

A COMPLETE IMPORTANCE RANKING FOR COMPONENTS OF BINARY COHERENT SYSTEMS, WITH EXTENSIONS TO MULTI-STATE SYSTEMS

David A. Butler
Oregon State University, Corvallis, Oregon

ABSTRACT

Means of measuring and ranking a system's components relative to their importance to the system reliability have been developed by a number of authors. This paper investigates a new ranking that is based upon minimal cuts and compares it with existing definitions. The new ranking is shown to be easily calculated from readily obtainable information and to be most useful for systems composed of highly reliable components. The paper also discusses extensions of importance measures and rankings to systems in which both the system and its components may be in any of a finite number of states. Many of the results about importance measures and rankings for binary systems are shown to extend to the more sophisticated multi-state systems. Also, the multi-state importance measures and rankings are shown to be decomposable into a number of sub-measures and rankings.

Given a system composed of many components, a question of considerable interest is which components are most crucial to the proper functioning of the system. In response to this question, a number of importance measures and rankings have been proposed [3], [4], [5], [10]. This paper investigates a new ranking and compares it to existing rankings, principally the ranking induced by the Birnbaum reliability importance measure. The new ranking is based upon minimal cuts and provides a complete ordering of all the system's components relative to their importance to the system reliability.
This ranking has three key points in its favor: (i) the calculations involved require only readily obtainable information; (ii) the calculations are usually quite simple; and (iii) the ranking is designed for use with systems consisting of highly reliable components, the most common case.

The final section of the paper deals with extensions of importance measures and rankings to systems in which both the system and its components may be in any of a finite number of states. Many of the results about importance measures and rankings for binary systems established in preceding sections are shown to extend to the more sophisticated multi-state systems. Also, the multi-state importance measures and rankings are shown to be decomposable into a number of sub-measures and rankings.

1. DEFINITIONS OF COMPONENT IMPORTANCE-BINARY SYSTEMS

Consider a binary coherent system of n independent components with structure function φ(x) and reliability function h(p). (Our definitions, terminology and notation regarding binary coherent systems follow those of [2].)
Both structural measures can be derived from the Birnbaum reliability importance measure, assuming a common reliability p for all components. Specifically, the Birnbaum structural importance measure is the Birnbaum reliability importance measure evaluated at p = 0.5 [4]. The Barlow-Proschan structural importance measure [3] is the "average" (integral) of the Birnbaum reliability measure as p ranges over [0,1]. But for most systems the typical component's reliability is not 0.5 or even 0.5 "on the average," but rather is much higher. This is especially so for systems with complex structure functions incor- porating redundancy, because redundancy in the design of a system is usually only incorporated if a non-redundant design with highly reliable components cannot produce a satisfactory level of system reliability. Thus using either of these two measures to compare components of such systems may give a misleading picture of which components are most important. It would seem desirable, therefore, to develop a measure or ranking that is structural (i.e., is based solely upon the system structure function and therefore not upon p), yet is somehow related to the Birnbaum reliability importance measure for high values of p. The ranking proposed in this paper is such a result. This ranking is based upon cuts and provides a complete ordering of all components. It extends an earlier ranking which provided only a partial ordering of the com- ponents [5]. To introduce this ranking, consider the following example. ' EX AMPLE 1. i <f>{x) = [1 - (1 - X,)(l - X5)] • [1 - (1 - X2)(I - X3)] • [1 - (1 - X2)(l- Xft)] • [1 - (1 - X4)(I - X5)(l - Xe)] [1 - (1 - X3)(l - X4)(l - X5)] • [1 - (1 - X,)(l - X2)] Mincuts: Ci = {l,5), C2={2,3}, C3={2,6} C4= (4,5,6), C5 = {3,4,5}, Ce = {1.2}. 
Assuming that the components function or fail independently of one another, and assuming a common reliability p for each component, the Birnbaum reliability importances can be written as

I_b(1; p) = 2(1 − p) − 3(1 − p)^2 − (1 − p)^3 + 3(1 − p)^4 − (1 − p)^5
I_b(2; p) = 3(1 − p) − 4(1 − p)^2 − (1 − p)^3 + 3(1 − p)^4 − (1 − p)^5
I_b(3; p) = (1 − p) − (1 − p)^2 − 2(1 − p)^3 + 3(1 − p)^4 − (1 − p)^5
I_b(4; p) = 2(1 − p)^2 − 5(1 − p)^3 + 4(1 − p)^4 − (1 − p)^5
I_b(5; p) = (1 − p) + (1 − p)^2 − 5(1 − p)^3 + 4(1 − p)^4 − (1 − p)^5
I_b(6; p) = (1 − p) − (1 − p)^2 − 2(1 − p)^3 + 3(1 − p)^4 − (1 − p)^5

Denote the component orderings induced by the Birnbaum structural importance measure, the Barlow-Proschan structural importance measure, and the Birnbaum reliability importance measure by >_BS, >_BPS, >_BR, respectively. (Note that >_BR depends implicitly upon p.) Then

2 >_BS 5 >_BS 1 >_BS 3 =_BS 6 >_BS 4,
2 >_BPS 5 >_BPS 1 >_BPS 3 =_BPS 6 >_BPS 4,

and 2 >_BR 5, 1 >_BR 3 =_BR 6 >_BR 4 for all p = (p, p, ..., p). For p < (−1 + √5)/2 ≈ .618, 5 >_BR 1 and the three rankings are identical. But for p > (−1 + √5)/2, 1 >_BR 5.

Notice that if the Birnbaum reliability measures I_b(i; p) are written as polynomials in (1 − p), as above, then for high values of p the lowest-order terms in the polynomial dominate the rest. Thus, in this example, by looking only at the lowest-order terms in the formulas for I_b(i; p), it is apparent that for high values of p, 2 >_BR 1, 1 >_BR 5, 1 >_BR 3, 1 >_BR 6, 5 >_BR 4, 3 >_BR 4, 6 >_BR 4. By examining the lowest- and the second lowest-order terms, we can further determine that for high values of p, 5 >_BR 3 and 5 >_BR 6.

This suggests a possible way to define a new structural ranking that would agree with >_BR for high values of p. It will be more convenient to define this new structural ranking in terms of the system's min cuts, rather than the coefficients in the polynomial expressions for I_b(i; p).
However, we will see that the resulting definition is equivalent to the above.

DEFINITION: For each component i of a coherent system (N, phi) with t minimal cuts, let d_{lj}^{(i)} denote the number of collections of l distinct min cuts such that the union of each collection contains exactly j components and includes component i, (1 <= l <= t, 1 <= j <= n). Let b_j^{(i)} = Sum_{l=1}^{t} (-1)^{l-1} d_{lj}^{(i)}. Let b^{(i)} = (b_1^{(i)}, ..., b_n^{(i)}). Component i is more cut-important than component k, denoted i >_c k, if and only if b^{(i)} >> b^{(k)}, where >> denotes lexicographic ordering. Components i and k are equally cut-important, denoted i =_c k, if and only if b^{(i)} = b^{(k)}.

This definition, although rather formidable in appearance, is in practice usually easy to apply and work with. Because the ranking depends upon a lexicographic ordering, most components can be ranked by determining only the first few components of b^{(i)}. Also, the first non-zero component of any b^{(i)} is particularly easy to compute (see Proposition 1 and Corollary 1).

D.A. BUTLER

EXAMPLE 1 (continued): For i = 2, the non-zero d_{lj}^{(2)}'s are as follows:

d_{12}^{(2)} = 3, d_{23}^{(2)} = 4, d_{24}^{(2)} = 4, d_{25}^{(2)} = 4, d_{34}^{(2)} = 3, d_{35}^{(2)} = 11, d_{36}^{(2)} = 5, d_{45}^{(2)} = 4, d_{46}^{(2)} = 11, d_{56}^{(2)} = 6, d_{66}^{(2)} = 1.

Thus b^{(2)} = (0,3,-4,-1,3,-1). Similarly,

b^{(1)} = (0,2,-3,-1,3,-1), b^{(3)} = (0,1,-1,-2,3,-1), b^{(4)} = (0,0,2,-5,4,-1), b^{(5)} = (0,1,1,-5,4,-1), b^{(6)} = (0,1,-1,-2,3,-1).

Therefore 2 >_c 1 >_c 5 >_c 3 =_c 6 >_c 4.

Note that b^{(i)} is the vector of coefficients in the polynomial expression for I_h(i; p). We will show that this is so in general. Also note that the cut-importance ranking and the Birnbaum reliability importance ranking for high values of p are in agreement.

2. ANALYSIS OF THE CUT-IMPORTANCE RANKING: BINARY SYSTEMS

As stated in the introduction, the cut-importance ranking has three main favorable properties: (i) it is based upon readily obtainable information, (ii) it is usually easily calculated, and (iii) it is designed for use when component reliabilities are high.
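The vectors b^{(i)} of this definition can be generated mechanically by enumerating collections of min cuts. A brute-force sketch for Example 1 (ours, not the paper's; the O(2^t) enumeration is harmless here since t = 6):

```python
from itertools import combinations

CUTS = [{1, 5}, {2, 3}, {2, 6}, {4, 5, 6}, {3, 4, 5}, {1, 2}]
N = 6

def b_vector(i):
    # b_j^{(i)} = sum over l of (-1)^(l-1) * d_{lj}^{(i)}, where
    # d_{lj}^{(i)} counts collections of l distinct min cuts whose union
    # has exactly j components and contains component i.
    b = [0] * N
    for l in range(1, len(CUTS) + 1):
        for coll in combinations(CUTS, l):
            u = set().union(*coll)
            if i in u:
                b[len(u) - 1] += (-1) ** (l - 1)
    return tuple(b)
```

Since Python compares tuples lexicographically, sorting components by b_vector in decreasing order reproduces the cut-importance ordering 2 >_c 1 >_c 5 >_c 3 =_c 6 >_c 4.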
The first property is already established, since this ordering is based only upon the system structure function through the minimal cuts of the system. This section will deal with the second and third properties.

The precise meaning of the third property of the cut-importance ranking is given in Theorems 1 and 2 below. The first theorem relates the cut-importance ranking to the Birnbaum reliability importance measure in the case where the component reliabilities are equal and high.

THEOREM 1: For p = (p, p, ..., p) where the scalar p is sufficiently close to one, the orderings >_BR and >_c are identical.

PROOF: The above is a direct result of Lemma 1 which follows. Using the lemma, it is clear that i =_c k if and only if I_h(i; p) = I_h(k; p) for all p. [A scalar second argument is used in I_h(k; p) when p = (p, ..., p).] Also, i >_c k if and only if b^{(i)} - b^{(k)} >> 0, and b^{(i)} - b^{(k)} >> 0 if and only if I_h(i; p) > I_h(k; p) for all p sufficiently close to one.

LEMMA 1: I_h(i; p) = Sum_{j=1}^{n} b_j^{(i)} (1-p)^{j-1}.

PROOF:

h(p) = Pr[ Intersection_{i=1}^{t} E_i ],

where E_i denotes the event that at least one component in the i-th min cut functions. Thus

h(p) = 1 - Pr[ Union_{i=1}^{t} E_i^c ].

By the inclusion-exclusion principle ([8], pp. 98-101),

h(p) = 1 - Sum_{l=1}^{t} (-1)^{l-1} S_l,

where

S_l = Sum_{1 <= j_1 < j_2 < ... < j_l <= t} Pr[ E_{j_1}^c Int E_{j_2}^c Int ... Int E_{j_l}^c ].

Now the event E_{j_1}^c Int ... Int E_{j_l}^c is the event that all components k in C_{j_1} U C_{j_2} U ... U C_{j_l} fail. Thus, using the independence assumption,

S_l = Sum_{1 <= j_1 < ... < j_l <= t} Prod_{k in C_{j_1} U ... U C_{j_l}} (1 - p_k),

where C_1, ..., C_t are the minimal cuts of the system. Thus

I_h(i; p) = dh(p)/dp_i = Sum_{l=1}^{t} (-1)^{l-1} Sum_{j_1 < ... < j_l : i in C_{j_1} U ... U C_{j_l}} Prod_{k in C_{j_1} U ... U C_{j_l}, k != i} (1 - p_k).

Recalling p_1 = p_2 = ... = p_n = p, and the definition of d_{lj}^{(i)},

I_h(i; p) = Sum_{l=1}^{t} (-1)^{l-1} Sum_{j=1}^{n} d_{lj}^{(i)} (1-p)^{j-1} = Sum_{j=1}^{n} b_j^{(i)} (1-p)^{j-1}.

Theorem 1 establishes the relationship the cut-importance ranking has to the Birnbaum reliability importance ranking in the case of high and equal component reliabilities. We now consider the case where the component reliabilities are high but unequal.
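Lemma 1 can be spot-checked numerically for Example 1: the importance obtained by direct enumeration must agree with the polynomial Sum_j b_j^{(i)} (1-p)^{j-1}. A self-contained sketch (ours; both sides are computed by brute force under the example's cut list):

```python
from itertools import combinations, product

CUTS = [{1, 5}, {2, 3}, {2, 6}, {4, 5, 6}, {3, 4, 5}, {1, 2}]
N = 6

def phi(x):
    # Binary structure function: fails iff some min cut is all-failed.
    return 0 if any(all(x[j] == 0 for j in c) for c in CUTS) else 1

def importance_enum(i, p):
    # Left-hand side of Lemma 1: I_h(i; p) by enumeration.
    others = [j for j in range(1, N + 1) if j != i]
    total = 0.0
    for st in product([0, 1], repeat=N - 1):
        x = dict(zip(others, st))
        w = 1.0
        for s in st:
            w *= p if s else 1 - p
        total += w * (phi({**x, i: 1}) - phi({**x, i: 0}))
    return total

def importance_poly(i, p):
    # Right-hand side of Lemma 1, expanded via the d_{lj}^{(i)} counts:
    # sum over cut collections containing i of (-1)^(l-1) (1-p)^(|union|-1).
    total = 0.0
    for l in range(1, len(CUTS) + 1):
        for coll in combinations(CUTS, l):
            u = set().union(*coll)
            if i in u:
                total += (-1) ** (l - 1) * (1 - p) ** (len(u) - 1)
    return total
```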
Let p(eps) be a vector-valued function of the positive scalar eps for which 0 < p_i(eps) < 1 for all eps in (0, infinity) and 1 <= i <= n. Let lim_{eps -> 0} p(eps) = 1. Unfortunately, it is not true in general that the component ordering induced by I_h(.; p(eps)) coincides with >_c for all eps sufficiently close to zero. However, with some additional assumptions on p(eps) some partial results along these lines are possible. First we establish a simple and computationally convenient formula for the first non-zero coordinate in any vector b^{(k)}.

PROPOSITION 1: For each component k, let e_k be the cardinality of the smallest minimal cut containing component k, and let f_k be the number of minimal cuts of cardinality e_k containing k. Then (i) e_k = min{j : b_j^{(k)} != 0}, and (ii) f_k = b_{e_k}^{(k)}.

PROOF: By definition d_{1,e_k}^{(k)} = f_k. Also any union of two or more minimal cuts at least one of which contains k must have cardinality at least e_k + 1. Thus d_{l,e_k}^{(k)} = 0 for all l >= 2. Therefore

b_{e_k}^{(k)} = Sum_{l=1}^{t} (-1)^{l-1} d_{l,e_k}^{(k)} = f_k.

Also, since component k is contained in no cuts of cardinality smaller than e_k, d_{1,j}^{(k)} = 0 for all j < e_k. Thus b_j^{(k)} = 0 for j < e_k.

COROLLARY 1: (i) If e_l < e_k, then l >_c k. (ii) If e_l = e_k and f_l > f_k, then l >_c k.

THEOREM 2: Let q_i(eps) = 1 - p_i(eps). Assume that for some M_1, M_2 in R,

1/M_1 <= q_i(eps)/q_k(eps) <= 1/M_2 for all components i, k and all sufficiently small eps.

If either (i) e_l < e_k, or (ii) e_l = e_k and f_l/f_k > (M_1/M_2)^{e_l}, then there exists an eps-bar > 0 such that I_h(l; p(eps)) > I_h(k; p(eps)) for all eps < eps-bar.

PROOF: See Theorem 2 in [5].

Further results along these lines are surely possible, but their value is questionable because the hypotheses become too complex. From a practical standpoint, users of the cut-importance ranking should be aware that while the cut-importance ranking can be useful even when component reliabilities are unequal, it may be misleading if the differences in the orders of magnitude of the unreliabilities are too great.
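Proposition 1 makes e_k and f_k trivial to read off the min-cut list; for Example 1 this takes a few lines (a sketch, names ours):

```python
CUTS = [{1, 5}, {2, 3}, {2, 6}, {4, 5, 6}, {3, 4, 5}, {1, 2}]

def e_f(k):
    # e_k: cardinality of the smallest min cut containing k;
    # f_k: number of min cuts of that cardinality containing k.
    sizes = [len(c) for c in CUTS if k in c]
    e = min(sizes)
    return e, sizes.count(e)
```

Corollary 1 then gives 2 >_c 1 >_c {3, 5, 6} >_c 4 for Example 1 with no further computation.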
To summarize, the reliability importance measures and rankings probably give generally superior results to the structural ones and should be used unless (i) the probabilistic information required for their calculation is not available or (ii) the computations involved are prohibitively extensive. However, when one or the other of these conditions prevails, a structural measure or ranking must be employed. The Birnbaum or the Barlow-Proschan structural measure can be used, but we have seen that doing so is equivalent to using the Birnbaum reliability importance measure I_h(i; p) with p = 0.5 or 0.5 "on the average." If one feels that the component reliabilities, although not precisely known, are high, then the cut-importance ranking seems preferable, since, as Theorems 1 and 2 have shown, its results are the same as those given by the Birnbaum reliability importance ranking for high values of p.

We now turn to the question of the computational complexities involved in determining the cut-importance ranking of a system's components. It is clear that the task of computing the entire vector b^{(k)} for each component k can be a formidable one for a complex system with many minimal cuts. However, Proposition 1 and Corollary 1 show that components often can be compared by only determining the easily computed quantities e_k and f_k. For instance, in Example 1 it is possible to determine that 2 >_c 1 >_c 5, 3, 6 >_c 4 in this manner. Also, since the structure function is symmetric in x_3 and x_6, it is clear that 3 =_c 6. Thus additional calculations are necessary only to compare components 3 and 5. The ordering of these two components can be determined by computing the next entries in b^{(3)} and b^{(5)}, namely b_3^{(3)} and b_3^{(5)}. The last three entries in each vector b^{(k)} are irrelevant for the purposes of ranking the components in this example. In general, most components can be compared by determining the first non-zero entry in b^{(k)} via Corollary 1.
Other entries in b^{(k)} are computed only as necessary. Computations can also be simplified when the system under consideration contains modules.

PROPOSITION 2: Let (A, chi) be a module of (N, phi) and let phi(x) = psi(chi(x^A), x^{A'}). Let phi-b^{(k)}, chi-b^{(k)}, and psi-b^{(k)} denote the b^{(k)} vectors corresponding to the structures phi, chi, and psi, respectively. Then

phi-b_j^{(k)} = Sum_{l=1}^{j} chi-b_l^{(k)} * psi-b_{j-l+1}^{(A)} for all k in A,

where the definition of chi-b^{(k)} is extended to include zero coordinates for l > |A|, and psi-b^{(A)} is extended similarly. (The above equation is just an expression of the fact that phi-b^{(k)} is the convolution of the finite sequences chi-b^{(k)} and psi-b^{(A)}.)

PROOF: In the remainder of the paper the dependence of I_h(i; p) upon p will at times be suppressed and the notation simplified to I_h(i). Let I_h^phi(.), I_h^chi(.), and I_h^psi(.) denote the respective Birnbaum reliability importance measures. These three quantities are related as follows [4]:

I_h^phi(k) = I_h^psi(A) * I_h^chi(k) for all k in A.

Thus by Lemma 1,

Sum_{j=1}^{n} phi-b_j^{(k)} (1-p)^{j-1} = [ Sum_{m} psi-b_m^{(A)} (1-p)^{m-1} ] * [ Sum_{l=1}^{|A|} chi-b_l^{(k)} (1-p)^{l-1} ].

Since this equality holds for all 0 <= p <= 1, each pair of coefficients of the two polynomials must be identical.

Proposition 2 can be applied to make the calculation of the cut-importance component ranking simpler when the system contains modules.

EXAMPLE 1 (continued): Components 3 and 6 form a module: A = {3,6}, chi(x^A) = x_3 x_6, and

psi(z, x^{A'}) = 1 - (1 - x_1 x_2 x_4)(1 - x_1 z)(1 - x_2 x_5).

chi-b^{(3)} = (1,-1), chi-b^{(6)} = (1,-1), psi-b^{(A)} = (0,1,0,-2,1).

By using the concept of the dual of a coherent system, it is possible to develop a component ranking analogous to >_c, but based upon minimal paths instead of minimal cuts. This ordering can be shown to be identical to the ordering induced by I_h(.; p) when p is sufficiently small.
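The convolution in Proposition 2 is ordinary polynomial multiplication of coefficient sequences; a minimal sketch (ours) reproduces b^{(3)} for Example 1 from its module decomposition:

```python
def conv(a, b):
    # Discrete convolution of coefficient sequences, 1-indexed as in
    # Proposition 2: c_j = sum over l of a_l * b_{j-l+1}.
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return tuple(c)
```

Convolving chi-b^{(3)} = (1,-1) with psi-b^{(A)} = (0,1,0,-2,1) returns (0,1,-1,-2,3,-1), the b^{(3)} found earlier by direct enumeration.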
3. COMPONENT IMPORTANCE IN MULTI-STATE SYSTEMS

This section deals with extensions of the results of the preceding sections to systems in which both the system and its components may be in any of a finite number of states. Of course, any such increase in the sophistication of the model used to represent a real system entails disadvantages as well as advantages, and this extension to multi-state models is not intended to suggest that binary models are generally inadequate. To the contrary, in most cases they suffice quite well. However, in some instances a small increase in the number of states (say, to three or perhaps four) can result in a much improved model.

One of the main difficulties with multi-state models is the increased notational complexity. For this reason, and because the number of states in a practical model must be kept small if the model is to be manageable, the following definitions will be given for ternary (three-state) systems; however, they will be given in a manner that illustrates the extension to general M-state systems. Whenever the extension of a definition or result to M-state systems is unclear, some further explanation will be given.

The study of multi-state systems is a relatively new area in reliability theory. Most articles in this area have dealt with generalizing particular classes of results [1], [7], [9], [11], [12]. The most general paper in the area is Barlow's [1]. Let X_j denote the state of component j (X_j = 0, 1, 2; 1 <= j <= n). Given a collection of minimal cuts C_1, C_2, ..., C_t which define the system structure, Barlow defines the system state phi(X) as the state of the "best" component in the "worst" min cut, i.e.,

phi(X) = min_{1 <= i <= t} { max_{j in C_i} {X_j} }.

Let Z_j = I_{[X_j >= k]} and let psi = I_{[phi(X) >= k]}. Both Z and psi are binary, and psi is a function only of Z. Because of this property, most results about binary coherent systems have immediate generalizations under Barlow's extended definition.
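Barlow's min-max definition is directly computable from a min-cut list. A one-line sketch (ours; the cut list used in the example is arbitrary, for illustration only):

```python
def barlow_phi(x, cuts):
    # System state = state of the "best" component in the "worst" min cut.
    # x: dict component -> state in {0, 1, 2}; cuts: list of sets.
    return min(max(x[j] for j in c) for c in cuts)
```

For 0/1-valued x this reduces to the usual binary structure function: the system is up iff every min cut contains a functioning component.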
However, there are many more reasonable choices for the structure function in a multi-state setting than Barlow's definition allows (see [6] for some examples). To accommodate such choices, a more general definition of a multi-state coherent system is proposed below. Let S = {x in R^n : x_i = 0, 1, 2}, and let (._i, x) = (x_1, x_2, ..., x_{i-1}, ., x_{i+1}, ..., x_n).

DEFINITION: Component i is relevant if and only if phi(2_i, x) != phi(0_i, x) for some x in S. Otherwise component i is irrelevant. Component i is fully relevant if and only if phi(2_i, x) != phi(1_i, x) for some x in S and phi(1_i, y) != phi(0_i, y) for some y in S.

DEFINITION: A structure function is coherent if and only if (i) phi(0) = 0 and phi(2) = 2, (ii) phi(x) is non-decreasing in x, (iii) each component is relevant. The ordered pair (N, phi) is called a (generalized or ternary) coherent system.

If a component is not fully relevant, then only two states are required to describe its status. Such components are permissible in a generalized coherent system to allow for a mixture of binary and ternary components.

Define the matrix P = [p_{ij}] by p_{ij} = Pr[component i is in state j], 1 <= i <= n, 0 <= j <= 2. The reliability function, h(P), is defined by

h(P) = Pr[phi(X) >= m],

where 0 < m <= 2. The effect of this generalized definition of the reliability function is to consider systems whose components have several states but whose structure function effectively has two states (>= m or < m). All subsequent definitions and results are for a fixed value of m. (For simplicity, the dependence of h(.) upon m is suppressed in the notation.)

For any matrix A = [a_{lj}], let (k_i, A) denote the matrix whose l,j-th entry is given by

(k_i, A)_{lj} = a_{lj} if l != i; = 1 if l = i, j = k; = 0 if l = i, j != k.

DEFINITION: The r,s reliability importance of component i, denoted by I_h^{r,s}(i; P), is given by

I_h^{r,s}(i; P) = h(r_i, P) - h(s_i, P),

where r, s = 0, 1, 2 and r > s.
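The three coherence conditions can be checked mechanically for any tabulated ternary structure function. A sketch (ours; the tables in the usage example are hypothetical, chosen only to exercise the checker):

```python
from itertools import product

def is_coherent(phi, n):
    # phi: dict mapping each ternary state vector (tuple of 0/1/2) to 0/1/2.
    S = list(product(range(3), repeat=n))
    # (i) boundary conditions phi(0) = 0 and phi(2) = 2.
    if phi[(0,) * n] != 0 or phi[(2,) * n] != 2:
        return False
    # (ii) phi nondecreasing in each coordinate.
    for x in S:
        for i in range(n):
            if x[i] < 2:
                y = x[:i] + (x[i] + 1,) + x[i + 1:]
                if phi[y] < phi[x]:
                    return False
    # (iii) every component relevant: phi(2_i, x) != phi(0_i, x) somewhere.
    for i in range(n):
        if all(phi[x[:i] + (2,) + x[i + 1:]] == phi[x[:i] + (0,) + x[i + 1:]]
               for x in S):
            return False
    return True
```

For instance, phi(x) = min(x_1, x_2) passes, while phi(x) = x_1 fails because component 2 is irrelevant.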
The 2,0 reliability importance will sometimes be simply called the reliability importance and be denoted by I_h(i; P). The r,s reliability importance of component i is the probability that the system is in state m or better given component i is in state r, minus the probability that the system is in state m or better given component i is in state s.

DEFINITION: A vector x in S is r,s critical for component i if and only if phi(r_i, x) >= m and phi(s_i, x) < m. (r, s = 0, 1, 2; r > s)

DEFINITION: Let n_phi^{r,s}(i) = |{x in S : x is r,s critical for component i}|. The r,s structural importance of component i, I_phi^{r,s}(i), is given by

I_phi^{r,s}(i) = n_phi^{r,s}(i) / 3^n.

In the following, whenever a vector p in R^3 appears in an expression normally involving the matrix P, P will be understood to be the matrix all of whose rows are equal to p.

PROPOSITION 3: (i) I_phi^{2,0}(i) = I_phi^{2,1}(i) + I_phi^{1,0}(i)
(ii) I_h^{2,0}(i) = I_h^{2,1}(i) + I_h^{1,0}(i)
(iii) I_phi^{r,s}(i) = I_h^{r,s}(i; (1/3, 1/3, 1/3)).

PROOF: The proofs of (i) and (ii) are trivial. To prove (iii), note that by summing over the 3^{n-1} possible values of X_k, k != i,

h(j_i, (1/3, 1/3, 1/3)) = 3^{-n+1} Sum_{x in S: x_i = j} I_{[phi(x) >= m]} = 3^{-n} Sum_{x in S} I_{[phi(j_i, x) >= m]},

where in the above, I_A denotes the indicator function of the set A. Thus

I_h^{2,1}(i; (1/3, 1/3, 1/3)) = 3^{-n} Sum_{x in S} [ I_{[phi(2_i, x) >= m]} - I_{[phi(1_i, x) >= m]} ] = 3^{-n} n_phi^{2,1}(i) = I_phi^{2,1}(i).

The proof for I_phi^{1,0}(.) is the same, and the proof for I_phi^{2,0}(.) follows from parts (i) and (ii).

Parts (i) and (ii) of the above result show that both 2,0 importance measures decompose into the sum of the 2,1 and 1,0 importance measures. The generalized cut-importance ranking to be defined later has a similar property. In practice, it is likely that the 2,0 measures and rankings would be the most commonly used. However, the other measures and rankings can be useful in providing more detailed information about which states are most relevant in determining a given component's ranking. (See Example 2.)
Given a generalized coherent system (N, phi) and a partition C = (C_0, C_1, C_2) of N into three sets, define x(C) in S by

(x(C))_i = 0 if i in C_0; = 1 if i in C_1; = 2 if i in C_2.

The function x(C) shows how any partition C determines the states of all the components.

DEFINITION: A partition C = (C_0, C_1, C_2) of N is a cut if and only if phi(x(C)) < m. A cut C is a minimal cut if and only if phi(y) >= m for all y in S such that y >= x(C), y != x(C).

While it is in principle possible to develop a complete cut-importance ranking for generalized coherent systems, in practice the calculation of the entire generalized b^{(k)} vector for each component is too complex to be feasible. However, a partial ordering of the components which involves very few calculations can be developed by generalizing Proposition 1 and Corollary 1 appropriately. First, the notions of the size of a partition and the union of partitions must be defined.

DEFINITION: The size of a partition C = (C_0, C_1, C_2), denoted by z(C), is a_0|C_0| + a_1|C_1|. (a_0, a_1 are arbitrary constants satisfying a_0 > a_1 > 0.) The roles of the constants a_0, a_1 are discussed later.

DEFINITION: Let T^1, ..., T^l be partitions of N, where T^i = (T_0^i, T_1^i, T_2^i). The union of T^1, ..., T^l is the partition D = (D_0, D_1, D_2) with

D_0 = Union_{i=1}^{l} T_0^i, D_1 = ( Union_{i=1}^{l} T_1^i ) - D_0, D_2 = N - D_0 - D_1,

so that x(D) = min_{1 <= i <= l} x(T^i) componentwise.

DEFINITION: Consider a ternary coherent system with minimal cuts C^i = (C_0^i, C_1^i, C_2^i), i = 1, 2, ..., t. For each component k, let

e_k^{r+1,r} = min { z(C^i) - a_r : k in C_r^i, 1 <= i <= t }

and

f_k^{r+1,r} = |{ C^i : k in C_r^i, z(C^i) - a_r = e_k^{r+1,r}, 1 <= i <= t }|.

(By convention e_k^{r+1,r} = +infinity if k is in no C_r^i, 1 <= i <= t; e_k^{r+1,r} is just the size of the "smallest" min cut which contains k in the r-th set of the partition, less a_r, and f_k^{r+1,r} is the number of such min cuts.)
For all r, s = 0, 1, 2 such that r > s, define

e_k^{r,s} = min_{s <= u < r} { e_k^{u+1,u} } and f_k^{r,s} = Sum_{s <= u < r : e_k^{u+1,u} = e_k^{r,s}} f_k^{u+1,u}.

Component l is more r,s cut-important than component k, denoted l >_c^{r,s} k, if and only if either (i) e_l^{r,s} < e_k^{r,s}, or (ii) e_l^{r,s} = e_k^{r,s} and f_l^{r,s} > f_k^{r,s}. The 2,0 cut-importance ranking will sometimes be simply called the cut-importance ranking and be denoted by >_c.

As in the binary case, each of the r,s importance rankings is consistent with the ranking induced by the corresponding r,s importance measure. To be more specific, let p(eps) = (eps^{a_0}, eps^{a_1}, 1 - eps^{a_0} - eps^{a_1}). As eps approaches zero, p(eps) puts almost all its mass on the best state, state 2. Of the mass left over, the ratio of the mass put on state 0 to that put on state 1 approaches zero. Thus the parameters a_0 and a_1 give the relative weights put on components in state 0 versus components in state 1 in the cut-importance ranking, and also determine the relative likelihoods of a component partially failing (state 1) and fully failing (state 0).

THEOREM 3: For eps sufficiently close to zero, the component ranking induced by I_h^{r,s}(.; p(eps)) is consistent with the r,s cut-importance ranking (r > s).

PROOF:

I_h^{r+1,r}(k; p(eps)) = h((r+1)_k, p(eps)) - h(r_k, p(eps)) = [1 - Pr( Union_i E_i | X_k = r+1 )] - [1 - Pr( Union_i E_i | X_k = r )],

where E_i = {X <= x(C^i)}. Thus by the inclusion-exclusion principle,

(1) I_h^{r+1,r}(k; p(eps)) = Sum_{l=1}^{t} (-1)^{l-1} [ S_l^r - S_l^{r+1} ],

where

S_l^r - S_l^{r+1} = Sum_{j in J_l} [ Pr( Int_{v=1}^{l} E_{j_v} | X_k = r ) - Pr( Int_{v=1}^{l} E_{j_v} | X_k = r+1 ) ]

and J_l = {(j_1, j_2, ..., j_l) : 1 <= j_1 < j_2 < ... < j_l <= t}. Letting G = {j in J_l : min_{1 <= v <= l} (x(C^{j_v}))_k = r},

S_l^r - S_l^{r+1} = Sum_{j in G} Pr( Int_{v=1}^{l} E_{j_v} | X_k = r ),

because the two probabilities in the first expression for S_l^r - S_l^{r+1} are equal for all j in J_l - G (since the X_i's are independent), and for j in G the second probability is zero. Thus

S_l^r - S_l^{r+1} = Sum_{j in G} Pr[ X_w <= min_{1 <= v <= l} (x(C^{j_v}))_w, 1 <= w <= n | X_k = r ] = Sum_{j in H} eps^{a_0|D_0| + a_1|D_1| - a_r} (1 + o(1)),

where H = {j in J_l : k in D_r} (= G, since (x(D))_k = r if and only if k in D_r) and where D = (D_0, D_1, D_2) is the union of C^{j_1}, ..., C^{j_l}.
By the definitions of e_k and f_k, the lowest-order term in this polynomial expression for S_1^r - S_1^{r+1} is f_k eps^{e_k}. [For notational simplicity the superscripts r+1, r, which should appear on e_k, f_k, >_c, and I_h, will be dropped in the remainder of the proof.]

Next we show that S_l^r - S_l^{r+1} = o(eps^{e_k}) for all l >= 2. Let D be the union of any l minimal cuts C^{j_1}, ..., C^{j_l} satisfying k in D_r (reindexing if necessary, we may assume k in C_r^{j_1}). Then

(2) |D_0| >= |C_0^{j_1}| and |D_0| + |D_1| >= |C_0^{j_1}| + |C_1^{j_1}|.

Assume that the above two inequalities simultaneously hold as equalities. Then |D_0| = |C_0^{j_1}|, which implies that D_0 = C_0^{j_1}, and |D_1| = |C_1^{j_1}|, which implies that D_1 = C_1^{j_1}. Thus D = C^{j_1}, and so x(D) = x(C^{j_1}). Now as an immediate consequence of the definition of D, x(D) <= x(C^{j_2}), and so x(C^{j_1}) <= x(C^{j_2}). Furthermore, C^{j_2} != C^{j_1}, so the inequality must be strict in at least one coordinate. But this contradicts the assumption that the cut C^{j_1} is minimal, and so at least one of the inequalities in (2) must be strict. Now the lowest power of eps in the polynomial expression for S_l^r - S_l^{r+1} is a_0|D_0| + a_1|D_1| - a_r. But

a_0|D_0| + a_1|D_1| - a_r = (a_0 - a_1)|D_0| + a_1(|D_0| + |D_1|) - a_r
> (a_0 - a_1)|C_0^{j_1}| + a_1(|C_0^{j_1}| + |C_1^{j_1}|) - a_r = a_0|C_0^{j_1}| + a_1|C_1^{j_1}| - a_r >= e_k.

Thus S_l^r - S_l^{r+1} = o(eps^{e_k}) and, by equation (1),

(3) I_h(k; p(eps)) = f_k eps^{e_k} + o(eps^{e_k}).

Now assume that l >_c k. If e_l < e_k, then by equation (3)

I_h(l; p(eps)) - I_h(k; p(eps)) = f_l eps^{e_l} + o(eps^{e_l}).

Thus for eps sufficiently close to zero this expression is positive, and so the two orderings of l and k are identical in this case. If e_l = e_k and f_l > f_k, then again by equation (3)

I_h(l; p(eps)) - I_h(k; p(eps)) = (f_l - f_k) eps^{e_l} + o(eps^{e_l}).

Thus in this case, also, the two orderings are consistent for eps sufficiently small. This establishes the theorem for all the r+1, r orderings. To establish the result for any r,s ordering, note that

I_h^{r,s}(k; p(eps)) = Sum_{u=s}^{r-1} I_h^{u+1,u}(k; p(eps)).

(See Proposition 3, part (ii).) Combining this result with equation (3),

I_h^{r,s}(k; p(eps)) = f_k^{r,s} eps^{e_k^{r,s}} + o(eps^{e_k^{r,s}}).

The remainder of the proof is identical to the r+1, r case.
[NOTE: For ternary systems, the only r+1, r orderings are the 2,1 ordering and the 1,0 ordering. The only other r,s ordering is the 2,0 ordering. The notation in the proof and the definition of r,s cut-importance has been kept more general so that the extensions to general M-state systems can be more readily understood.]

EXAMPLE 2: In the following diagram, the states are shown in a lattice arrangement according to the less-than-or-equal-to relation.

phi(2,2) = 2
phi(2,1) = 2   phi(1,2) = 2
phi(2,0) = 2   phi(1,1) = 1   phi(0,2) = 1
phi(1,0) = 1   phi(0,1) = 0
phi(0,0) = 0

Consider the case where m = 2, a_0 = 2, a_1 = 1.

Min cuts: C^1 = (E, {1,2}, E), C^2 = ({1}, E, {2}) (E denotes the empty set).

e_1^{2,1} = 1; e_2^{2,1} = 1; f_1^{2,1} = 1; f_2^{2,1} = 1
e_1^{1,0} = 0; e_2^{1,0} = +infinity; f_1^{1,0} = 1; f_2^{1,0} = 0
e_1^{2,0} = 0; e_2^{2,0} = 1; f_1^{2,0} = 1; f_2^{2,0} = 1

Components 1, 2 are not comparable under the 2,1 ranking; 1 >_c^{1,0} 2; and 1 >_c^{2,0} 2.

h(P) = 1 - (p_{10} + p_{11})(p_{20} + p_{21}) - p_{10} + p_{10}(p_{20} + p_{21}) = 1 - p_{11}(p_{20} + p_{21}) - p_{10}.

I_h^{2,1}(1; p(eps)) = eps^2 + eps.    I_h^{2,1}(2; p(eps)) = eps.
I_h^{1,0}(1; p(eps)) = 1 - eps - eps^2.    I_h^{1,0}(2; p(eps)) = 0.
I_h^{2,0}(1; p(eps)) = 1.    I_h^{2,0}(2; p(eps)) = eps.

Thus component 1 is more important than component 2 in an overall sense (i.e., according to the 2,0 cut-importance ranking). Moreover, the 2,1 and 1,0 rankings of the components show that it is the state 1 to state 0 transition of the components which determines the 2,0 ranking here.

As was the case for binary systems, analogous results based upon minimal paths can be developed for ternary systems composed of very unreliable components.

4. CONCLUSIONS

Reliability engineers usually know or can calculate the minimal cuts of the systems with which they deal. However, the component reliabilities, though usually thought to be fairly high, are often not known with any degree of precision. The cut-importance rankings developed in this paper are structural rankings, i.e., they depend only upon the system structure through the minimal cuts.
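The quantities in Example 2 can be verified by direct enumeration. The sketch below (ours) computes h(P) and the r,s reliability importances exactly and reproduces the eps-expressions above at p(eps) = (eps^{a_0}, eps^{a_1}, 1 - eps^{a_0} - eps^{a_1}) with a_0 = 2, a_1 = 1:

```python
from itertools import product

# State table of Example 2 (two ternary components), threshold m = 2.
PHI = {(2, 2): 2, (2, 1): 2, (1, 2): 2, (2, 0): 2, (1, 1): 1,
       (0, 2): 1, (1, 0): 1, (0, 1): 0, (0, 0): 0}
M = 2

def h(p1, p2):
    # h(P) = Pr[phi(X) >= M] for independent components whose state
    # distributions are p1, p2 (tuples (p_{i0}, p_{i1}, p_{i2})).
    return sum(p1[x1] * p2[x2]
               for x1, x2 in product(range(3), repeat=2)
               if PHI[(x1, x2)] >= M)

def importance(i, r, s, p1, p2):
    # I_h^{r,s}(i; P) = h(r_i, P) - h(s_i, P): pin component i to state
    # r, then to state s, and subtract.
    def pin(st):
        return tuple(1.0 if j == st else 0.0 for j in range(3))
    if i == 1:
        return h(pin(r), p2) - h(pin(s), p2)
    return h(p1, pin(r)) - h(p1, pin(s))
```

With eps = 0.01, importance(1, 2, 1, p, p) evaluates to eps^2 + eps and importance(2, 1, 0, p, p) to 0, matching the tabulated expressions.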
Also they are defined so as to relate closely to the Birnbaum reliability importance measure when the component reliabilities are high. Thus they provide reliability analysts and engineers with ways to meaningfully compare the relative importance to the system reliability of the various components of the system.

REFERENCES

[1] Barlow, R.E., "Coherent Systems with Multi-State Components," University of California Operations Research Center Technical Report ORC 77-5, Berkeley, California (January 1977).
[2] Barlow, R.E. and F. Proschan, Statistical Theory of Reliability and Life Testing: Probability Models (Holt, Rinehart, and Winston, 1975).
[3] Barlow, R.E. and F. Proschan, "Importance of System Components and Fault Tree Events," Stochastic Processes and Their Applications, Vol. 3, pp. 153-172 (1975).
[4] Birnbaum, Z.W., "On the Importance of Different Components in a Multi-Component System," in Multivariate Analysis II, P.R. Krishnaiah (ed.) (Academic Press, New York, 1969).
[5] Butler, D.A., "An Importance Ranking for System Components Based upon Cuts," Operations Research, Vol. 25, No. 5, pp. 874-879 (1977).
[6] Butler, D.A., "A Complete Importance Ranking for Components of Binary Coherent Systems, with Extensions to Multi-State Systems," Technical Report No. 183, Department of Operations Research, Stanford University, Stanford, CA (1977).
[7] El-Neweihi, E., F. Proschan and J. Sethuraman, "Multi-State Coherent Systems," Florida State University Statistics Report M434 (October 1977).
[8] Feller, W., An Introduction to Probability Theory and Its Applications, Vol. I, 3rd ed., pp. 98-101 (John Wiley and Sons, New York, 1968).
[9] Hatoyama, Y., "Fundamental Concepts for Reliability Analysis of Three-State Systems," unpublished manuscript, Department of Operations Research, Stanford University (1976).
[10] Lambert, H.E., "Measures of Importance of Events and Cut-Sets in Fault Trees," Lawrence Livermore Laboratory UCRL-75853 (October 1974).
[11] Murchland, J.D., "Fundamental Concepts and Relations for Reliability Analysis of Multi-State Systems," Reliability and Fault Tree Analysis, Society for Industrial and Applied Mathematics (1975).
[12] Postelnicu, V., "Nondichotomic Multi-Component Structures," Bulletin Mathematique de la Societe des Sciences Mathematiques de la Republique Socialiste de Roumanie, Vol. 14 (62), No. 2, pp. 209-217 (1970).

A DIFFUSION MODEL FOR THE CONTROL OF A MULTIPURPOSE RESERVOIR SYSTEM

Dror Zuckerman*
Department of Operations Research
College of Engineering
Cornell University
Ithaca, New York

ABSTRACT

This paper develops a methodology for optimizing operation of a multipurpose reservoir with a finite capacity V. The input of water into the reservoir is a Wiener process with positive drift. There are n purposes for which water is demanded. Water may be released from the reservoir at any rate, and the release rate can be increased or decreased instantaneously with zero cost. In addition to the reservoir, a supplementary source of water can supply an unlimited amount of water demanded during any period of time. There is a cost of C_i dollars per unit of demand supplied by the supplementary source to the i-th purpose (i = 1, 2, ..., n). At any time, the demand rate R_i associated with the i-th purpose (i = 1, 2, ..., n) must be supplied. A controller must continually decide the amount of water to be supplied by the reservoir for each purpose, while the remaining demand will be supplied through the supplementary source with the appropriate costs. We consider the problem of specifying an output policy which minimizes the long run average cost per unit time.

1. INTRODUCTION AND FORMULATION

Complex systems of reservoirs today are used to produce supplies of water for agriculture, industry and urban use. In addition, the production of hydroelectric power is usually a major objective of water resource systems.
An excellent account of the theory of storage systems, describing results obtained up to 1964, is contained in Prabhu's paper [9]. Considerable progress has since been made in several directions, but most of the models are descriptive, rather than control-oriented. Dynamic programming models for the optimal control of multipurpose reservoir systems have been proposed by Hall, Butcher and Esogbue [4], Russell [10] and many others. Most of the models involved discrete time analysis. Meanwhile other authors, notably Bather [1], Faddy [2], [3], Haslett [5] and Pliska [8], have developed diffusion models for the control of a dam with finite reservoir capacity, where the optimality was defined in terms of a cost (or a utility) structure imposed on the operation of the system. The main purpose of this article is to provide an additional insight into the nature of the optimal controls.

*Now at The Hebrew University of Jerusalem.
This research was supported in part by the National Science Foundation under Grant MBS 73-04437.

In the present study we develop a methodology for optimizing operation of a multipurpose reservoir with a finite capacity V, described by the following model: The input of water into the reservoir is determined by a Wiener process with positive drift mu and variance sigma^2. There are n purposes for which water is demanded. Water may be released from the reservoir at any rate R (R >= 0). Let R_i be the demand in units of water per unit time associated with the i-th purpose (i = 1, 2, ..., n). At any time the release rate may be increased or decreased with zero cost, any such changes taking effect instantaneously. In addition to the reservoir, a supplementary source of water can supply an unlimited amount of water demanded during any period of time. There is a cost of C_i dollars per unit of demand provided by the supplementary source to the i-th purpose (i = 1, 2, ..., n). We assume without loss of generality that C_1 >= C_2 >= ... >= C_n.
At any time, the demand rate R_i associated with the i-th purpose (i = 1, 2, ..., n) must be supplied. A controller must continually decide the amount of water to be supplied by the reservoir for each purpose, while the remaining demand will be supplied through the supplementary source with the appropriate costs. We consider the problem of specifying an output policy which minimizes the long run average cost per unit time. An example will be presented to illustrate computational procedures.

2. THE MODEL

Let us denote by X(t) the input into the dam during the time interval (0, t]; as indicated earlier, {X(t); t >= 0} is a Wiener process. By an appropriate choice of units, we may assume without loss of generality that mu = 1, sigma^2 = 2. Note that negative values of the storage level (as in [1]) have to be taken into account. This representation is relatively crude, but a solution to the problem of optimal control is still useful, since control is needed only when the storage level is positive: for non-positive values of the storage level process the demand associated with the n purposes must be supplied totally through the supplementary source, under any permissible output policy.

If we assume, as in Pliska's paper [8], that 0 is a reflecting boundary, then the expected time that the dam is empty (dry) over any given period is 0, independently of the output policy which is employed. The above situation seems to be unrealistic for most reservoir models. Furthermore, the multipurpose reservoir model which is considered by us becomes meaningless, since the optimal policy in this case is to supply the demand associated with the n purposes totally through the reservoir, and the resulting average cost associated with the above policy will be zero. In view of this, the Bather model in our case seems to be more appropriate.
It will be very helpful for us to restrict the state space of the storage level process to a finite interval (in order to apply some results obtained by Mandl [6] and Pliska [7]). Therefore we make the following modification: Assume that the storage level is bounded from below by -1, where -1 is an elementary return boundary. That is, when the trajectory of the storage level process reaches the boundary -1 it remains at -1 for a random amount of time which possesses the exponential distribution with mean 1, and after the termination of the sojourn time on the boundary -1, the process jumps into position 0 with probability 1. Clearly, the expected transition time from -1 to 0 is the same as under the original process. Thus, since our goal is to minimize the total long run average cost per unit time, the above modification of the storage level process does not affect the decision problem (it is just a mathematical tool).

Now let us consider the set of admissible output policies. In selecting the output policy only the current state, that is, the level of water in the reservoir, is important. The particular time is irrelevant since the input process is time homogeneous and since we are concerned with an infinite future. Thus we consider only policies Gamma such that

Gamma: [-1, V] -> S, where S = {(a_1, a_2, ..., a_n) | 0 <= a_i <= 1, for i = 1, 2, ..., n}.

In words, Gamma(x) = (Gamma_1(x), Gamma_2(x), ..., Gamma_n(x)) means that under an output policy Gamma, when the storage level is x, 100 Gamma_i(x) percent of the demand rate associated with the i-th purpose will be supplied by the reservoir, and 100(1 - Gamma_i(x)) percent of it will be provided by the supplementary source, for i = 1, 2, ..., n.

The set M of admissible controls consists of all piecewise continuous functions Gamma(.) on [-1, V] with range in S. Let {Z_Gamma; Gamma in M} be the controlled process, corresponding to the storage level in the dam when a policy Gamma in M is employed.
The controlled process {Z_Γ; Γ ∈ M} is a diffusion process whose state space is the interval (-1, V), with drift parameter

(1)    b_Γ(x) = 1 - Σ_{i=1}^n Γ_i(x) R_i

and diffusion parameter (see Mandl [6], p. 12)

(2)    a_Γ(x) = sigma^2 / 2 = 1.

We assume that V is a reflecting boundary.

The cost arising from continuous movement of the controlled process is given by a bounded piecewise continuous function C_Γ(x) = Σ_{i=1}^n C_i R_i (1 - Γ_i(x)) defined on [-1, V]. If the trajectory is in position x at time t, then there arises a cost of magnitude C_Γ(x)Δt + o(Δt) in the time interval (t, t + Δt). We want to find a control Γ* ∈ M which minimizes the long run average cost per unit time.

3. THE OPTIMAL POLICY

In this section we show that the optimal output policy has the following form:

(3)    Γ(z) = (0, 0, 0, ..., 0, 0)     if z <= 0,
           = (1, 0, 0, ..., 0, 0)     if 0 < z <= γ_1,
           = (1, 1, 0, ..., 0, 0)     if γ_1 < z <= γ_2,
           = (1, 1, 1, 0, ..., 0)     if γ_2 < z <= γ_3,
             ...
           = (1, 1, 1, ..., 1, 1)     if z > γ_{n-1},

D. ZUCKERMAN

where 0 <= γ_1 <= γ_2 <= ... <= γ_{n-1} <= V. (Recall that C_1 >= C_2 >= ... >= C_n by assumption.) In the special case in which C_1 = C_2 = ... = C_n, as one might anticipate, the optimal control specifies the maximum discharge rate at all positive levels, i.e., γ_1* = γ_2* = ... = γ_{n-1}* = 0.

First consider the subclass of policies L ⊂ M under which, at any given state, the demand associated with the i-th purpose (i = 1, 2, ..., n) will be satisfied totally by the reservoir, or totally by the supplementary source. It can easily be seen that L is the collection of all piecewise constant functions Γ(·) on [-1, V] with range in S' ⊂ S, where S' = {(a_1, a_2, ..., a_n) | a_i = 0 or 1, for i = 1, 2, ..., n}. Thus under an output policy Γ ∈ L, the set (0, V] can be decomposed into a finite number of intervals, on each of which Γ is a constant. That is, for each output policy Γ ∈ L, there exists a real sequence {γ_i}_{i=0}^p such that 0 = γ_0 < γ_1 < γ_2 < ... < γ_p = V, where Γ(·) is constant over each interval (γ_{i-1}, γ_i) for i = 1, 2, ..., p.
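The threshold structure in (3) is simple to state computationally. A small editorial sketch (not code from the paper) of the policy and of the resulting drift b_Γ(z) = 1 - Σ_i Γ_i(z) R_i, using the demand rates R_1 = 0.9, R_2 = 0.2 that appear in the paper's later example:

```python
def threshold_policy(z, gammas, n):
    """Policy of form (3): supply purposes 1..j from the reservoir, where j
    grows as the level z rises past thresholds gamma_1 <= ... <= gamma_{n-1}."""
    if z <= 0:
        return [0] * n
    j = 1 + sum(1 for g in gammas if z > g)  # purposes 1..j served by the reservoir
    return [1] * j + [0] * (n - j)

def drift(z, gammas, R):
    """Drift b(z) = 1 - sum_i Gamma_i(z) * R_i of the controlled storage level."""
    policy = threshold_policy(z, gammas, len(R))
    return 1 - sum(p * r for p, r in zip(policy, R))

R = [0.9, 0.2]  # two purposes, one threshold gamma_1
print(threshold_policy(30, [55], 2), drift(30, [55], R))  # below gamma_1: only purpose 1; b = 1 - 0.9
print(threshold_policy(70, [55], 2), drift(70, [55], R))  # above gamma_1: both purposes; b = 1 - 1.1
```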
For each policy Γ ∈ L, we define the following sets:

(4)    A_i(Γ) = {j | 1 <= j <= n, Γ_j(x) = 1 for x ∈ (γ_{i-1}, γ_i)},    (1 <= i <= p).

The theory of optimal control in diffusion processes (Theorem 5 of Mandl [6], p. 168) implies that under a given policy Γ ∈ M, the long run average cost per unit time, φ_Γ, is the unique number for which there exists a continuous function w_Γ(·) on [-1, V] such that

(5)    dw_Γ(z)/dz + b_Γ(z) w_Γ(z) + C_Γ(z) - φ_Γ = 0

holds for every z ∈ (-1, V) which is a point of continuity of Γ, and such that

(6)    w_Γ(0) = φ_Γ - Σ_{j=1}^n R_j C_j,

(7)    w_Γ(V) = 0.

Let

    α_i(Γ) = 1 - Σ_{j ∈ A_i(Γ)} R_j    and    β_i(Γ) = Σ_{j ∉ A_i(Γ)} R_j C_j.

The general solution of the differential equation (5), assuming α_i(Γ) ≠ 0 for a policy Γ ∈ L, is given by

(8)    w_Γ(z) = (φ_Γ - β_i(Γ)) / α_i(Γ) + d_i e^{-α_i(Γ) z}    for z ∈ (γ_{i-1}, γ_i), (i = 1, 2, ..., p),

where the d_i are arbitrary constants. The solution for the case α_i(Γ) = 0 can be obtained by considering the limiting behaviour of (8) as α_i(Γ) approaches 0.

Recalling that w_Γ(·) is a continuous function over [-1, V], the constants d_i must be chosen to assure continuity of w_Γ(·) at the points γ_i (i = 1, 2, ..., p-1). Thus we obtain the following equations:

(9)    (φ_Γ - β_i(Γ)) / α_i(Γ) + d_i e^{-α_i(Γ) γ_i} = (φ_Γ - β_{i+1}(Γ)) / α_{i+1}(Γ) + d_{i+1} e^{-α_{i+1}(Γ) γ_i},    i = 1, 2, ..., p - 1.

The optimal control Γ* will be determined with the aid of the following theorem.

THEOREM 1: A control Γ* is optimal if and only if

(10)    Γ_i*(x) = 1 if C_i + w_Γ*(x) >= 0,
        Γ_i*(x) = 0 if C_i + w_Γ*(x) < 0,

for i = 1, 2, ..., n and for every x which is a continuity point of Γ*, where w_Γ* is the solution of the differential equation (5) when policy Γ* is employed; equivalently, Γ_i*(x) = I(C_i + w_Γ*(x) >= 0), where I(E) is the indicator function of the event E.

PROOF: Let θ_{Γ,ψ}(x) = b_ψ(x) w_Γ(x) + C_ψ(x), for Γ ∈ M and ψ ∈ M. According to Theorem 6 of Mandl ([6], p. 168), Γ* is optimal if and only if

(11)    θ_{Γ*,Γ*}(x) = min_{ψ ∈ M} θ_{Γ*,ψ}(x)

for every x ∈ (0, V] which is a continuity point of Γ*.
(Note that a_ψ(x) = 1 for every position x ∈ (0, V] under any output policy ψ ∈ M.) For a given policy ψ ∈ M we have

(12)    θ_{Γ*,ψ}(x) = (1 - Σ_{i=1}^n R_i ψ_i(x)) w_Γ*(x) + Σ_{i=1}^n (1 - ψ_i(x)) C_i R_i
                   = w_Γ*(x) + Σ_{i=1}^n C_i R_i - Σ_{i=1}^n R_i ψ_i(x) [w_Γ*(x) + C_i].

It then clearly follows that θ_{Γ*,Γ*}(x) = min_{ψ ∈ M} θ_{Γ*,ψ}(x) for every x ∈ (0, V] which is a continuity point of Γ* if and only if (10) holds for i = 1, 2, ..., n. This concludes the proof.

Generally, there is no guarantee that an admissible optimal control will exist. However, in our case it follows from Theorem 1 that if an optimal output policy Γ* ∈ M exists, then Γ* ∈ L. But the existence of an optimal control in the subclass L of policies follows directly from Theorem 4.1 of Pliska [7]. Thus there exists an optimal control in M. We proceed with the following proposition.

PROPOSITION 1: Let Γ* be the optimal output policy; then w_Γ* is nondecreasing over the interval [0, V].

PROOF: We will summarize briefly the main steps of the proof. Clearly φ_Γ* - Σ_{i=1}^n R_i C_i <= 0. Using equations (6) and (7), we obtain the following inequality:

(13)    w_Γ*(0) <= w_Γ*(V) = 0.

From (13) it can be seen by elementary analysis that if w_Γ* is not nondecreasing, then there exist two points x and y in the open interval (0, V), which are continuity points of Γ*, such that

(14)    w_Γ*(x) = w_Γ*(y),

(15a)    dw_Γ*(z)/dz |_{z=x} < 0,

(15b)    dw_Γ*(z)/dz |_{z=y} > 0.

Using equations (10) and (14) we obtain that Γ*(x) = Γ*(y). Hence,

(16)    b_Γ*(x) = b_Γ*(y)

and

(17)    C_Γ*(x) = C_Γ*(y).

From (5) we obtain

(18)    dw_Γ*(z)/dz |_{z=x} + b_Γ*(x) w_Γ*(x) - φ_Γ* + C_Γ*(x)
      = dw_Γ*(z)/dz |_{z=y} + b_Γ*(y) w_Γ*(y) - φ_Γ* + C_Γ*(y);

now substituting equations (14), (16) and (17) into (18) we have

    dw_Γ*(z)/dz |_{z=x} = dw_Γ*(z)/dz |_{z=y},

which is a contradiction to (15a) and (15b). Hence w_Γ* is nondecreasing.

Recalling that C_1 >= C_2 >= ... >= C_n, and using Proposition 1 and Theorem 1, it follows that if Γ_j*(x) = 1 for a given j, then Γ_i*(z) = 1 for all z such that x <= z <= V and for each i such that 1 <= i <= j.
Now, in order to establish that the optimal output policy has the form given in (3), we still have to show that Γ_1*(x) = 1 for every positive x. But this is a direct consequence of the following proposition.

PROPOSITION 2: The optimal output policy Γ* satisfies the following condition:

    w_Γ*(0) + C_1 >= 0.

PROOF: The proof will be done by contradiction. Suppose that

(19)    w_Γ*(0) + C_1 = -δ < 0.

Since w_Γ* is continuous on [0, V], it follows that there exists ε(δ) > 0 such that w_Γ*(y) + C_1 < 0 for 0 < y <= ε(δ). Now recalling that Γ* satisfies (10), it follows that Γ_i*(y) = 0 for 0 < y <= ε(δ) and for i = 1, 2, ..., n. Using equations (6) and (8) we have

    w_Γ*(y) = φ_Γ* - Σ_{i=1}^n C_i R_i    for 0 <= y <= ε(δ).

Since w_Γ*(0) = w_Γ*(ε(δ)), we can repeat the same argument over the intervals [ε(δ), 2ε(δ)], ..., and therefore Γ_i*(y) = 0 for i = 1, 2, ..., n and for every position y ∈ [0, V]. Thus Γ* is the trivial policy that keeps the output rate constantly at zero. But (7) must hold, so φ_Γ* = Σ_{i=1}^n C_i R_i and w_Γ*(0) = 0. But C_1 > 0, so

(20)    w_Γ*(0) + C_1 = φ_Γ* - Σ_{i=1}^n C_i R_i + C_1 > 0,

which is a contradiction to (19). Therefore w_Γ*(0) + C_1 >= 0, as required.

This concludes the proof that the optimal output policy is of the form given in (3), as desired.

In the following example, we will illustrate how to determine the optimal control values γ_1*, γ_2*, ..., γ_{n-1}*.

EXAMPLE: Let us consider the following case. We have a finite dam with capacity V = 100. There are two types of demand for water, where R_1 = 0.9, R_2 = 0.2, C_1 = KC (K >= 1), C_2 = C. First note that for a given policy Γ ∈ L which has the form of (3), w_Γ is given by (see (8))

(21)    w_Γ(z) = 10φ_Γ - 2C + 10 d_1 e^{-0.1z}    for 0 <= z <= γ_1,
        w_Γ(z) = -10φ_Γ - 10 d_2 e^{0.1z}         for γ_1 < z <= V = 100.

Using (6), (7) and (9) we obtain the following equations:

    1.  10φ_Γ - 2C + 10 d_1 = φ_Γ - C(0.9K + 0.2),

    2.  -10φ_Γ - 10 e^{10} d_2 = 0,

    3.  10φ_Γ - 2C + 10 e^{-0.1 γ_1} d_1 = -10φ_Γ - 10 e^{0.1 γ_1} d_2.

From the above three equations we obtain that the long run average cost associated with an output policy Γ which has the form given in (3) will be

(22)    φ_Γ = C [2 - 0.9 δ^{γ_1} (2 - K)] / [20 - 10 δ^{100-γ_1} - 9 δ^{γ_1}],

where δ = e^{-0.1}. From the cost function introduced above it can be seen that only the ratio of costs K = C_1/C_2 matters in determining the optimal critical value γ_1*.

Suppose that K = 4; then using (22) one can easily obtain that γ_1* = 55 minimizes φ_Γ subject to the constraint 0 <= γ_1 <= V = 100, with φ_Γ* ≈ 0.1 C.

ACKNOWLEDGMENT

The author acknowledges helpful and illuminating discussions with Professor N.U. Prabhu.

REFERENCES

[1] Bather, J., "A Diffusion Model for the Control of a Dam," Journal of Applied Probability, 5, 55-71 (1968).
[2] Faddy, M.J., "Optimal Control of Finite Dams: Continuous Output Procedure," Advances in Applied Probability, 6, 689-710 (1974).
[3] Faddy, M.J., "Optimal Control of Finite Dams: Discrete (2-Stage) Output Procedure," Journal of Applied Probability, 11, 111-121 (1974).
[4] Hall, W.A., W.S. Butcher and A.M.O. Esogbue, "Optimization of the Operation of a Multi-Purpose Reservoir by Dynamic Programming," Water Resources Research, 4, 471-477 (1968).
[5] Haslett, J., "The Control of a Multi-Purpose Reservoir," Advances in Applied Probability, 8, 592-609 (1976).
[6] Mandl, P., Analytical Treatment of One-Dimensional Markov Processes (Springer-Verlag, New York, 1968).
[7] Pliska, S.R., "Single Person Controlled Diffusions with Discounted Costs," Journal of Optimization Theory and Applications, 12, 248-255 (1973).
[8] Pliska, S.R., "A Diffusion Process Model for the Optimal Operation of a Reservoir System," Journal of Applied Probability, 12, 859-863 (1975).
[9] Prabhu, N.U., "Time-Dependent Results in Storage Theory," Journal of Applied Probability, 1, 1-46 (1964).
[10] Russel, C.B., "An Optimal Policy for Operating a Multipurpose Reservoir," Operations Research, 20, 1181-1189 (1972).
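As an editorial check (not part of the original paper), the cost function (22) from the example above can be minimized numerically; a simple grid search over γ_1 reproduces the reported optimum γ_1* = 55 (to grid accuracy) and φ_Γ* ≈ 0.1 C for K = 4:

```python
import math

def phi_over_C(gamma1, K, V=100.0):
    """Long run average cost phi / C from (22), with delta = exp(-0.1)."""
    d = math.exp(-0.1)
    num = 2.0 - 0.9 * d**gamma1 * (2.0 - K)
    den = 20.0 - 10.0 * d**(V - gamma1) - 9.0 * d**gamma1
    return num / den

# Grid search for the minimizing threshold when K = C1/C2 = 4.
grid = [g / 10.0 for g in range(0, 1001)]   # gamma_1 in [0, 100], step 0.1
best = min(grid, key=lambda g: phi_over_C(g, K=4.0))
print(best, phi_over_C(best, K=4.0))        # close to 55 and 0.101
```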
COMPUTATION TECHNIQUES FOR LARGE SCALE UNDISCOUNTED MARKOV DECISION PROCESSES

Thom J. Hodgson and Gary J. Koehler
University of Florida
Gainesville, Florida

ABSTRACT

In this paper we consider computation techniques associated with the optimization of large scale Markov decision processes. Markov decision processes and the successive approximation procedure of White are described. Then a procedure for scaling continuous time and renewal processes so that they are amenable to the White procedure is discussed. The effect of the scale factor value on the convergence rate of the procedure and insights into proper scale factor selection are given.

INTRODUCTION

One of the most powerful modeling tools for the analysis of controlled probabilistic systems is the Markov decision process. If the system can be structured as a Markov process and the control decisions for the system can be defined in terms of the relevant system costs and operational characteristics (transition probabilities), then there exists a wealth of theory that can be used to find the best (least cost, most profitable) set of decisions for operating the system. As with many modeling techniques, real probabilistic systems, when modeled as Markov processes, tend to have large numbers of system states. The result is that for many interesting and important systems, it is necessary to consider the computational aspects associated with performing policy optimization.

Many types of nondiscounted Markov decision processes can be transformed to a discrete time problem. Such a procedure was explicitly used by Schweitzer [17] for Markov renewal programs and involves choosing a parameter, b, for the transformation. As noted by Schweitzer, the value of b influences the asymptotic convergence rate when White's iterative procedure [22] is used to solve the transformed Markov decision process. We present theoretical insights into the determination of a b which yields the fastest asymptotic convergence.
In practice, one cannot easily find this optimal b, so we also present heuristic rules for choosing b. Computational results appear quite promising.

BACKGROUND

Consider a finite state, discrete time, completely ergodic Markov process which is controlled by a decision maker. For each of the N states (i), at each transition of the process, the decision maker chooses an action k = 1, ..., K_i. This action results in transition probabilities p_ij^k, j = 1, ..., N, and a reward (cost) q_i^k. Here p_ij^k is defined as the probability that the process, now in state i and under policy k, will move to state j over the next transition, and q_i^k is defined as the expected reward (cost) over the next transition. The problem is to find the gain optimal action for each state.

Howard [5] showed that for a given policy set, the simultaneous set of linear equations

(1)    v_i + g = q_i^k + Σ_{j=1}^N p_ij^k v_j,    i = 1, ..., N,    v_N = 0,

could be solved to compute the gain g of the process. The v_i's are the relative rewards (costs) of starting the process in state i. Howard showed that the optimal gain could be obtained using a simple policy iteration algorithm.

Consider a finite state, continuous time, completely ergodic Markov decision process. For each of the N states (i), at each transition, the decision maker chooses an action k = 1, ..., K_i. This action results in a transition rate a_ij^k and a reward (cost) rate q_i^k. Here a_ij^k is defined as follows: in an increment of time dt, the process, now in state i and under policy k, will move to state j with probability a_ij^k dt (i ≠ j), and q_i^k is the expected reward (cost) rate incurred over a residence in state i using action k. Howard [5] showed that for a given policy set, the set of equations

(2)    g = q_i^k + Σ_{j=1}^N a_ij^k v_j,    i = 1, ..., N,    v_N = 0,

could be solved to compute the gain g, and a policy iteration algorithm could be used to compute the optimal gain. Note that

    a_ii^k = - Σ_{j ≠ i} a_ij^k,    i = 1, ..., N.

Finally, consider a finite state, completely ergodic semi-Markov decision process.
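For a fixed policy, equations (1) are N linear equations in the N unknowns v_1, ..., v_{N-1}, g (with v_N = 0), so policy evaluation is a single linear solve. A sketch of this step (an editorial illustration using numpy, not code from the paper):

```python
import numpy as np

def evaluate_policy(P, q):
    """Solve v_i + g = q_i + sum_j P[i,j]*v_j with v_N = 0, for one fixed
    policy of a completely ergodic chain; returns (v, g)."""
    N = len(q)
    # Unknown vector: (v_1, ..., v_{N-1}, g).  Equation i reads
    #   v_i - sum_j P[i,j]*v_j + g = q_i,  with v_N fixed at 0.
    A = np.zeros((N, N))
    A[:, :N - 1] = np.eye(N)[:, :N - 1] - P[:, :N - 1]
    A[:, N - 1] = 1.0                    # coefficient of g in every equation
    x = np.linalg.solve(A, q)
    v = np.append(x[:N - 1], 0.0)
    return v, x[N - 1]

P = np.array([[0.5, 0.5], [0.3, 0.7]])
q = np.array([1.0, 2.0])
v, g = evaluate_policy(P, q)
# Stationary distribution is (0.375, 0.625), so the gain is 1.625.
print(v, g)
```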
The underlying Markov process has transition probabilities p_ij^k. The holding (transition) time m in going from state i to j is described by the density function h_ij^k(m), 0 < m < ∞. The expected holding time, given that the system starts in state i, is

    τ_i^k = Σ_{j=1}^N p_ij^k ∫_0^∞ m h_ij^k(m) dm.

Jewell [6] showed that for a given policy set, the set of equations

(3)    v_i + τ_i^k g = q_i^k + Σ_{j=1}^N p_ij^k v_j,    i = 1, ..., N,    v_N = 0,

could be solved to compute the gain g, and a policy iteration algorithm could be used to compute the optimal gain.

WHITE'S METHOD AND PROBLEM TRANSFORMATIONS

The bulk of the computational effort in policy iteration lies in solving (recursively) the sets of equations (1), (2), (3). For large processes, techniques such as Gaussian elimination quickly become untenable. White [22] proposed a successive approximation approach for the nondiscounted, discrete time, Markov decision process.¹ Odoni [13] added bounds for g which are useful in termination decisions. The White-Odoni technique can be summarized as follows.

Assume we have computed sets of values V_i(n-1), v_i(n-1), i = 1, ..., N, and a quantity g_{n-1}. We then compute a new set

(4)    V_i(n) = max_k { q_i^k + Σ_{j=1}^N p_ij^k v_j(n-1) },    i = 1, ..., N,
       g_n = V_M(n),
       v_i(n) = V_i(n) - g_n,
       L''(n) = max_i { V_i(n) - v_i(n-1) },
       L'(n) = min_i { V_i(n) - v_i(n-1) },

where M is a state of the process such that for all sets of policies and some integer u > 0, the probability of reaching state M in u transitions, starting in any state i, is nonzero for all states i. White showed that the repeated application of equations (4) will converge¹ to a solution for equations (1). Odoni showed that

    L''(n) >= L''(n+1) >= g >= L'(n+1) >= L'(n).

In practice, White's algorithm has proven to be very effective for large scale systems. It is stable, self-correcting, and lends itself to the exploitation of any supersparsity [7].
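The White-Odoni iteration (4) is straightforward to implement. The sketch below (an editorial illustration in numpy, not code from the paper) takes per-action reward vectors and transition matrices and iterates until the Odoni bounds pin down g:

```python
import numpy as np

def white_odoni(P, q, M=0, tol=1e-8, max_iter=10000):
    """White's successive approximations with Odoni's bounds.
    P[k] is the N x N transition matrix and q[k] the reward vector of action k
    (for simplicity of the sketch, the same action set in every state)."""
    N = len(q[0])
    v = np.zeros(N)
    for _ in range(max_iter):
        V = np.max([q[k] + P[k] @ v for k in range(len(P))], axis=0)
        upper, lower = np.max(V - v), np.min(V - v)   # L''(n), L'(n)
        v_new = V - V[M]                              # renormalize on state M
        if upper - lower < tol:
            return 0.5 * (upper + lower), v_new       # g is bracketed by the bounds
        v = v_new
    raise RuntimeError("no convergence")

# Single-policy 2-state example: stationary distribution (0.375, 0.625),
# rewards (1, 2), so the gain is 0.375*1 + 0.625*2 = 1.625.
P = [np.array([[0.5, 0.5], [0.3, 0.7]])]
q = [np.array([1.0, 2.0])]
g, v = white_odoni(P, q)
print(g)
```

The bound max_i/min_i of V_i(n) - v_i(n-1) brackets g for any starting v, so the loop can stop as soon as the bracket is narrower than the tolerance.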
While straightforward application of White's approach does not, in general, work for continuous time and semi-Markov processes, these processes can be transformed to a form compatible with White's approach. Consider equations (2) with v_i added to both sides of the equation:

(5)    v_i + g = q_i^k + Σ_{j ≠ i} a_ij^k v_j + (1 + a_ii^k) v_i,    i = 1, ..., N,    v_N = 0.

Noting the definition of a_ii^k, then if

(6)    a_ii^k = - Σ_{j ≠ i} a_ij^k >= -1,    i = 1, ..., N,

equation (5) is of the same form as equation (1): substituting 1 + a_ii^k for a_ii^k in the rate matrix, the new matrix [a_ij^k] has the properties of a stochastic matrix.

If (6) holds, it follows that White's method can be used to solve the continuous time Markov decision process. The following procedure can be used to convert a continuous time problem to satisfy (6):

1. Let a_max = max |a_ii^k|, i = 1, ..., N, k = 1, ..., K_i.

2. Divide all a_ij^k and q_i^k, i, j = 1, ..., N, k = 1, ..., K_i, by b > a_max. Condition (6) is now satisfied.

3. Using the new a_ij^k and q_i^k, solve the problem using White's method.

4. To express the results in terms of the original process, multiply the gain g by b. The optimal policy and relative rewards (costs), v_i, obtained are valid for the original process.

Note that the scaling really amounts to changing the time frame of the problem.

Consider the reorganization of equations (3):

(7)    g = q_i^k / τ_i^k + Σ_{j ≠ i} (p_ij^k / τ_i^k) v_j + ((p_ii^k - 1) / τ_i^k) v_i,    i = 1, ..., N,    v_N = 0.

Letting a_ij^k = p_ij^k / τ_i^k for j ≠ i and a_ii^k = (p_ii^k - 1) / τ_i^k, it is readily seen that equations (7) are of the same form as equations (2).

¹The assumptions used by White can be relaxed. Schweitzer [19] proved convergence for the general single chain acyclic process, while Su and Deininger [20] extended this to the periodic case. Such conditions are hard to test in practice. Recently Platzman [14] has given a weaker condition that can be readily tested. Finally, Morton and Wecker [11] have generalized most of the above and have added some new dimensions to the algorithm.
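Steps 1-4 above can be sketched as follows (an editorial illustration with made-up rates, not code from the paper): the rate matrix is scaled by b > a_max into a stochastic matrix, the discrete problem would then be solved by White's method, and the resulting gain is multiplied back by b.

```python
import numpy as np

def transform_rates(A, q, b):
    """Steps 1-2: turn a rate matrix A (rows sum to 0) and reward rates q
    into a stochastic matrix P = I + A/b and rewards q/b, for b > a_max."""
    a_max = np.max(np.abs(np.diag(A)))
    assert b > a_max, "b must exceed the largest |a_ii|"
    P = np.eye(len(q)) + A / b
    return P, q / b

# Hypothetical two-state continuous-time chain: leave state 1 at rate 3,
# state 2 at rate 1; reward rates (1, 2).
A = np.array([[-3.0, 3.0], [1.0, -1.0]])
q = np.array([1.0, 2.0])
P, q_scaled = transform_rates(A, q, b=4.0)
# Rows of P sum to 1 and are nonnegative, so White's method applies; step 4
# would multiply the gain of the scaled problem by b to recover the original gain.
print(P)
```

Here the stationary distribution (0.25, 0.75) is shared by A and P, so the scaled gain times b reproduces the continuous-time gain 0.25·1 + 0.75·2 = 1.75, illustrating why the scaling only changes the time frame.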
As a consequence, the transformation can also be applied to semi-Markov decision processes (the transformation is equivalent to Schweitzer's [17]). Note that a Markov process is itself a degenerate semi-Markov process subject to transformation. One would consider such a transformation if the convergence properties of White's method could be improved. We now address this issue.

CONVERGENCE FACILITATION

There are several procedures that have been used in accelerating convergence in solving discounted Markov decision processes. By and large, though, these have not been examined extensively in the non-discounted Markov decision process context. Briefly, the acceleration techniques include problem transformation [17], cheap iterations [10, 23], suboptimal activity elimination [1, 2, 3, 8, 9, 15, 16, 21] and extrapolation procedures [23]. We will limit our discussion here to problem transformation.

In solving (generalized) discounted Markov decision processes, it is well known that the largest spectral radius of the transition matrices (i.e., the process spectral radius) governs the asymptotic convergence rate. Porteus [15], Totten [21] and others have devised problem transformations to reduce the process spectral radius. Morton and Wecker [11] have shown that asymptotic relative values and policy convergence are at least of order (αλ)^n, where λ is greater than the subdominant eigenvalue² and 0 < α < ∞ is the discount factor. A reasonable question to ask is whether the choice of b in Step 2 can be made to reduce the modulus of the subdominant eigenvalue of the transition matrix of the optimal policy.

²The largest eigenvalue is always 1.0. The subdominant eigenvalue is the remaining eigenvalue having the largest modulus.

The transition matrix for policy δ resulting from the scaling procedure is I + (1/b) A_δ, where A_δ = P_δ - I.
Let λ and x be an eigenvalue and associated eigenvector, respectively, of the starting transition matrix I + (1/a_max) A_δ. Then

(8)    (a_max λ + b - a_max) / b

is an eigenvalue of I + (1/b) A_δ, with x its associated eigenvector. Now clearly

(9)    re [(a_max λ + b - a_max) / b] > re λ,

where re λ is the real part of λ, with -1 < re λ < 1 and b > a_max >= 0. However, it may not be true that

    |(a_max λ + b - a_max) / b| <= |λ|.

Suppose δ indexes an optimal policy and λ is a subdominant eigenvalue associated with this policy. Expanding the square of the modulus of both sides of (9), with λ = λ_1 + λ_2 i, gives that a reduction in the modulus of λ requires

    (1 - λ_1) [λ_1 + (b - a_max)/(b + a_max)] < λ_2².

If λ_2 = 0, then either λ_1 = 1 and no reduction can be made, or λ_1 < (a_max - b)/(a_max + b) and λ_1 is necessarily negative. In this case, it would appear that any b > a_max will yield a resultant benefit in asymptotic convergence. However, this is not necessarily true, since we may "bump" into another eigenvalue. That is, increasing b to decrease the absolute value of the dominant (negative) eigenvalue will eventually result in some other (positive) eigenvalue increasing until it becomes the new subdominant eigenvalue. At that point further increases in b will not improve the convergence rate.

As an example, consider the Markov process whose transition matrix is given as follows (the last two entries of the fifth row are illegible in the scan):

    .31  .13  .21  .05  .10  .20
    .15  .12  .16  .20  .12  .25
    .02  .01  .01  .01  .93  .02
    .12  .28  .09  .16  .04  .31
    .01  .85  .09  .05   …    …
    .11  .30  .10  .15  .14  .20

The eigenvalues are 1.0, -.8421, .6945, .2079, -.085 + .0116i, and -.085 - .0116i. It would appear that problem transformation should be of value in speeding convergence, since the subdominant eigenvalue is negative. From the preceding development, it would be expected that the convergence rate of the process is maximized at the value of b which results in the largest negative eigenvalue being equal to the largest positive eigenvalue. Applying equation (8), with a_max = .99, to equate the two eigenvalues of the transformed matrix, we get

    (a_max/b)(.8421) - (b - a_max)/b = (a_max/b)(.6945) + (b - a_max)/b.

Solving, we get b = 1.063. In other words, transforming the process using b = 1.063 should achieve the "best" asymptotic convergence for the process. As a test, White's algorithm was run using costs q = (1.14, 2.27, 5.06, 2.97, 3.96, 4.90) (only one policy per state). The problem was declared "solved" when L''(n) - L'(n) fell below a small tolerance (the exponent of the tolerance is illegible in the scan). Runs were made for various values of b (see Figure 1). The actual minimum number of iterations (30) occurred for a value of b = 1.09, whereas the number of iterations for b = 1.063 was slightly higher (31). The inaccuracy in prediction is expected, since the method of prediction considers only main effects and ignores the contribution of the smaller eigenvalues.

    FIGURE 1. Number of iterations to convergence versus b (b ranging from about 1.0 to 1.2; iteration counts from about 30 to 60).

As one might expect, the straightforward application of the above observations is not practical, since the determination of eigenvalues for large processes is itself difficult. However, in practice it is usually intuitively obvious to the analyst that a process may possess strong cyclic tendencies, indicating that some eigenvalue has a large negative real component. If the cyclic tendency is strong enough, this eigenvalue will be the subdominant eigenvalue, and the above development suggests that some b > a_max may improve the resulting asymptotic convergence rate. In any event, applying White's method using several values of b marginally larger than a_max, and noting the convergence rate of the process for the various values of b, can many times be of value.

In testing the above we noted that if b was made successively slightly larger than a_max, either the convergence improved dramatically or the convergence slightly deteriorated. To further test this observation, we randomly generated Markov decision problems with the number of states varying from 3 to 20.
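The b-selection rule illustrated above (equating the transformed eigenvalue moduli) can be checked numerically. The sketch below is an editorial illustration, not code from the paper; it applies the eigenvalue map (8) to the listed eigenvalues, taking a_max = .99 (the value implied by the matrix's .01 diagonal entry), and scans b for the smallest subdominant modulus:

```python
import numpy as np

# Eigenvalues of the example process and the assumed scale bound a_max.
eigs = np.array([1.0, -0.8421, 0.6945, 0.2079, -0.085 + 0.0116j, -0.085 - 0.0116j])
a_max = 0.99

def subdominant_modulus(b):
    """Largest modulus among the transformed non-unit eigenvalues, per (8)."""
    transformed = (a_max * eigs + b - a_max) / b
    # The unit eigenvalue always maps to 1; exclude it from the maximum.
    return max(abs(t) for t, e in zip(transformed, eigs) if e != 1.0)

bs = np.arange(1.0, 1.2, 1e-4)
best_b = min(bs, key=subdominant_modulus)
print(best_b, subdominant_modulus(best_b))
```

The minimizer falls where the transformed moduli of -.8421 and .6945 cross, near the b = 1.063 reported in the text; as the text notes, this prediction only approximates the empirically best b (1.09 in Figure 1), since the smaller eigenvalues also contribute.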
Within each state, ten different actions were available. White's method was used to solve each, using b values of

    b_0 = a_max + ε (for a small ε),    b_1 = 1.05 b_0,    b_2 = 1.10 b_0,    b_3 = 1.15 b_0.

Again, problems were declared "solved" at iteration n when L''(n) - L'(n) fell below the same small tolerance. If a problem was solved in fewer iterations for some b_i than for b_j with i > j, then the problem transformation was declared beneficial. Otherwise the transformation was classified as non-beneficial. Clearly, a problem could be mislabeled as non-beneficial using the grid given above and yet in fact be beneficial for some b > a_max; the opposite is not the case.

Table 1 gives the total number of iterations to solve the non-beneficial and beneficial problem cases. N_n and N_b stand for the number of problems labelled non-beneficial and beneficial, respectively. If we can assume that the average performance of the set of randomly generated problems used in this study is representative of the performance of the set of real world problems, then the following observations can be made. First, problems whose convergence can be improved by increases in b above a_max are those problems that are hard to solve anyway (see Table 1, 19.5 versus 35.4 iterations). Second, when a problem does not show convergence improvement when b is increased above a_max, the deterioration in convergence speed is not dramatic (see Table 1, 19.5 versus 22.8 iterations for a 15% increase in b above a_max). Finally, convergence improvements, when they occur, are rather dramatic (see Table 1, 35.4 to 18.5 iterations for a 15% change in b above a_max). These observations suggest that use of problem transformation can be of significant value in speeding convergence.

TABLE 1. Summary of Iteration Counts

                                        b_0     b_1     b_2     b_3
    Total iteration counts
      Non-beneficial (N_n = 46)         896     942     997    1047
      Beneficial     (N_b = 67)        2372    1520    1310    1241
    Average per problem
      Non-beneficial                   19.5    20.5    21.7    22.8
      Beneficial                       35.4    22.7    19.6    18.5

BIBLIOGRAPHY

[1] Hastings, N.A.J., "A Test for Non-Optimal Actions in Undiscounted Finite Markov Decision Chains," Management Science, 23, No. 1, pp. 87-92 (1976).
[2] Hastings, N.A.J. and J.M.C. Mello, "Erratum: Tests for Suboptimal Actions in Discounted Markov Programming," Management Science, 20, No. 17, p. 1143 (1974).
[3] Hastings, N.A.J. and J.M.C. Mello, "Tests for Suboptimal Actions in Discounted Markov Programming," Management Science, 19, No. 9, pp. 1019-1022 (1973).
[4] Hordijk, A. and H. Tijms, "The Method of Successive Approximations and Markovian Decision Problems," Operations Research, 22, pp. 519-521 (1974).
[5] Howard, R.A., Dynamic Programming and Markov Processes (MIT Press and Wiley, New York, 1960).
[6] Jewell, W.S., "Markov Renewal Programming: I and II," Operations Research, 11, pp. 938-971 (Nov.-Dec. 1963).
[7] Kalan, J.E., "Aspects of Large-Scale, In-Core Linear Programming," Proceedings of the ACM Annual Conference, Chicago, Illinois (August 3-5, 1971).
[8] MacQueen, J., "A Modified Dynamic Programming Method for Markovian Decision Problems," Journal of Mathematical Analysis and Applications, 14, pp. 38-43 (1966).
[9] MacQueen, J.B., "A Test for Suboptimal Actions in Markovian Decision Problems," Operations Research, 15, pp. 559-561 (1967).
[10] Morton, T.E., "On the Asymptotic Convergence Rate of Cost Differences for Markovian Decision Processes," Operations Research, 19, pp. 244-248 (1971).
[11] Morton, T.E. and W.E. Wecker, "Discounting, Ergodicity, and Convergence for Markov Decision Processes," Management Science, 23, pp. 890-900 (1977).
[12] Nering, E.D., Linear Algebra and Matrix Theory (2nd Ed., John Wiley and Sons, New York, 1970).
[13] Odoni, A.R., "On Finding the Maximal Gain for Markov Decision Processes," Operations Research, 17, pp. 857-860 (1969).
[14] Platzman, L., "Improved Conditions for Convergence in Undiscounted Markov Renewal Programming," Operations Research, 25, No. 3, pp. 529-533 (1977).
[15] Porteus, E.L., "Bounds and Transformations for Discounted Finite Markov Decision Chains," Operations Research, 23, No. 4, pp. 761-784 (1975).
[16] Porteus, E.L., "Some Bounds for Discounted Sequential Decision Processes," Management Science, 18, No. 1, pp. 7-11 (1971).
[17] Schweitzer, P.J., "Iterative Solution of the Functional Equations of Undiscounted Markov Renewal Programming," Journal of Mathematical Analysis and Applications, 34, pp. 495-501 (1971).
[18] Schweitzer, P.J., "Multiple Policy Improvements in Undiscounted Markov Renewal Programs," Operations Research, 19, pp. 784-793 (May-June 1971).
[19] Schweitzer, P.J., "Perturbation Theory and Markovian Decision Processes," MIT Operations Research Technical Report, 15 (June 1965).
[20] Su, S.Y. and R.A. Deininger, "Generalization of White's Method of Successive Approximations to Periodic Markovian Decision Processes," Operations Research, 20, No. 2, pp. 318-326 (1972).
[21] Totten, J.C., "Computational Methods for Finite State Finite Valued Markovian Decision Problems," Operations Research Center, University of California, Berkeley, ORC-71 (1971).
[22] White, D.J., "Dynamic Programming, Markov Chains, and the Method of Successive Approximations," Journal of Mathematical Analysis and Applications, 6, pp. 373-376 (1963).
[23] Zaldivar, M. and T.J. Hodgson, "Rapid Convergence Techniques for Markov Decision Processes," Decision Sciences, 6, pp. 14-24 (1975).

AN ALGORITHM (GIPC2) FOR SOLVING INTEGER PROGRAMMING PROBLEMS WITH SEPARABLE NONLINEAR OBJECTIVE FUNCTIONS

Claude Dennis Pegden
The Pennsylvania State University
University Park, Pennsylvania

Clifford C. Petersen
Purdue University
W. Lafayette, Indiana

ABSTRACT

This paper presents an algorithm for solving the integer programming problem possessing a separable nonlinear objective function subject to linear constraints. The method is based on a generalization of the Balas implicit enumeration scheme. Computational experience is given for a set of seventeen linear and seventeen nonlinear test problems. The results indicate that the algorithm can solve the nonlinear integer programming problem in roughly the equivalent time required to solve the linear integer programming problem of similar size with existing algorithms. Although the algorithm is specifically designed to solve the nonlinear problem, the results indicate that the algorithm compares favorably with the Branch and Bound algorithm in the solution of linear integer programming problems.

INTRODUCTION

This paper presents an algorithm for solving the following nonlinear pure integer programming problem:

(1)    Max g(x) = Σ_{j=1}^{NNS} f_j(x_j) + Σ_{j=NNS+1}^{NS} c_j x_j
       s.t.  Ax ⪌ b,    x ∈ I+,

where:

    c, A, and b denote the usual constant arrays;
    I+ denotes the set of all nonnegative integers;
    ⪌ denotes constraints of the less-than-or-equal type and greater-than-or-equal type;
    f_j(x) is a single variable nonlinear function with f_j(0) = 0;
    the region defined by Ax ⪌ b, x ∈ I+ is bounded and nonempty;
    NNS denotes the number of nonlinear stages;
    NS denotes the total number of stages.

There are several transformations which are useful to convert problems to the required form. If the problem contains k equality constraints of the form a_i x = b_i, we can replace this set by a set of k + 1 inequalities of the form a_i x <= b_i for i = 1, ..., k, together with

    Σ_{i=1}^k a_i x >= Σ_{i=1}^k b_i.

If the problem contains one or more nonlinear functions f_j(x_j) such that f_j(0) ≠ 0, we can replace each by a new function f'_j(x_j) = f_j(x_j) - f_j(0).
If the nonlinear portion of the objective function cannot be separated into functions of a single variable, but the nonseparable portion can be separated into k functions of linear integer combinations of the variables, we can convert the problem to the required form by replacing each of the k linear combinations in the objective function by a dummy variable d_k. The dummy variables are forced to assume the appropriate values by appending, for each k, the constraint that d_k equals the k-th linear combination. To illustrate, consider the following example:

(2)    Max (x_1 + 3x_2)² - 9x_2²
       s.t.  2x_1 + x_2 <= 5,    x_1, x_2 ∈ I+.

To convert the problem to the desired form we must express the objective function as the sum of nonlinear functions of a single variable. We accomplish this by replacing the linear combination x_1 + 3x_2 in the objective function by the dummy variable d_1 and appending the constraint that d_1 = x_1 + 3x_2, yielding the following equivalent problem:

(3)    Max d_1² - 9x_2²
       s.t.  2x_1 + x_2 <= 5,
             d_1 - x_1 - 3x_2 = 0,
             d_1, x_1, x_2 ∈ I+.

If the objective function contains a product term of two variables, we can employ the device of completing the square to transform the problem to the desired form. To illustrate, consider the following nonlinear integer programming problem:

(4)    Max x_1² + 6x_1x_2
       s.t.  2x_1 + x_2 <= 5,    x_1, x_2 ∈ I+.

At first glance, because of the product term 6x_1x_2, the objective function appears to be nonseparable. However, by adding and subtracting 9x_2² to the objective function we complete the square, and by factoring, the objective function becomes

    (x_1 + 3x_2)² - 9x_2².

The problem is now identical to the previous example and is convertible to the desired form by the introduction of a dummy variable.

The problem given by (1) is very difficult to solve with existing methods. If the problem contains only one or two constraints, it may be amenable to solution by dynamic programming.
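The completing-the-square identity behind example (4) is easy to verify by brute force over the (bounded) feasible region. A small editorial sketch (not from the paper):

```python
# Feasible region of example (4): 2*x1 + x2 <= 5, x1, x2 nonnegative integers.
feasible = [(x1, x2) for x1 in range(3) for x2 in range(6) if 2 * x1 + x2 <= 5]

for x1, x2 in feasible:
    # Identity used in the text: x1^2 + 6*x1*x2 == (x1 + 3*x2)^2 - 9*x2^2.
    assert x1**2 + 6 * x1 * x2 == (x1 + 3 * x2) ** 2 - 9 * x2**2

# The transformed problem therefore has the same optimum as the original.
best = max(feasible, key=lambda p: p[0] ** 2 + 6 * p[0] * p[1])
print(best, best[0] ** 2 + 6 * best[0] * best[1])
```

On this small region the maximum is 19, attained at (x_1, x_2) = (1, 3), under either form of the objective.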
If all f_j are nondecreasing functions and c and A are nonnegative, then the imbedded state space approach presented by Morin and Marsten [7,8] may be employed to help mitigate the "curse of dimensionality" normally encountered in problems having several constraints. Also, if the problem is of very small size and can be converted (using the binary expansion) to a zero-one polynomial problem [11], a solution may be obtainable using either the transformation of Watters [13], or a zero-one polynomial algorithm such as that given by Taha [12]. However, many nonlinear integer programming problems of both practical and theoretical significance fall into neither class and are therefore essentially unsolvable by methods other than the GIPC2 algorithm presented here.

2. OVERVIEW OF THE ALGORITHM

The GIPC2 algorithm is based upon the notion that although the solution space of an integer programming problem may be large, it is finite. The general approach of the algorithm is to implicitly enumerate, by means of a fathoming test, a set of candidate solutions to the problem. The set of candidate solutions is defined in such a way that it necessarily contains the optimal solution to the problem. The general phases of the algorithm are as follows:

I. Find a good feasible solution to the problem.

II. Determine a vector of upper and lower bounds on x.

III. Generate a set of candidate solutions to the problem. This set should be as small as possible, while necessarily containing the optimal solution.

IV. Implicitly search the set of candidate solutions for the optimal solution to the problem.

Note that in developing an implicit enumeration algorithm for the zero-one integer programming problem, the special structure of the problem can be exploited to eliminate Phases II and III. The set of candidate solutions can be simply defined as the set of vectors produced by all combinations of assignments of zero and one to each variable in the problem.
The major task of the Balas algorithm [1] consists essentially of Phase IV: implicit enumeration of candidate solutions. However, in the nonlinear integer programming problem given by (1), our task is more difficult. If we define the set of candidate solutions as simply all combinations of feasible integer values assigned to each variable, the number of candidate solutions can become so large, for some problems, as to make the approach computationally intractable. The key to the success of the GIPC2 algorithm, therefore, is the ability of the procedure to limit the set of candidate solutions to a manageable size, while guaranteeing that the optimal solution is contained within the set.

The steps of the algorithm require the solution of several linear programming problems for obtaining bounds on the optimum nonlinear solution. Our approach to solving the nonlinear problem consists essentially of substituting one of three different linear approximating functions for the nonlinear objective function at each step in the algorithm where a linear programming solution is required. The three linear approximating functions are defined as follows:

  c_0 x : a "good" linear approximating function to the nonlinear objective function. The linear function c_0 x does not necessarily bound the nonlinear function above or below.

  c_l x + a_l : a "good" lower bounding linear approximating function to the nonlinear objective function. For all x in the domain, c_l x + a_l ≤ g(x).

  c_u x + a_u : a "good" upper bounding linear approximating function to the nonlinear objective function. For all x in the domain, c_u x + a_u ≥ g(x).

All the linear programming solutions in the algorithm are used exclusively to obtain either a feasible solution or an upper or lower bound on the optimal solution. By appropriately selecting our linear approximating function for each linear programming problem we set bounds that narrow the range of search.
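For a single convex stage, one simple way to construct upper- and lower-bounding lines of the kind defined above is a chord and a shifted chord. This is an illustration under that convexity assumption, not the paper's own approximating procedure:

```python
# Illustrative construction of bounding lines c*x + a for one convex
# stage g on the integer domain {0, ..., U}.  For convex g, the chord
# through the endpoints bounds g from above; keeping the same slope and
# shifting the intercept down gives a valid lower bound.

def chord_upper(g, U):
    """Upper bound c_u, a_u: line through (0, g(0)) and (U, g(U))."""
    return (g(U) - g(0)) / U, g(0)

def shifted_lower(g, U, c):
    """Lower bound with slope c: shift the intercept down so that
    c*x + a <= g(x) at every integer point of the domain."""
    return c, min(g(x) - c * x for x in range(U + 1))

g = lambda x: x * x                 # a convex stage
cu, au = chord_upper(g, 6)          # 6*x >= x^2 on [0, 6]
cl, al = shifted_lower(g, 6, cu)    # 6*x - 9 <= x^2 everywhere
assert all(cl * x + al <= g(x) <= cu * x + au for x in range(7))
```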
Since the feasible region is not altered, the algorithm guarantees an exact solution to the nonlinear integer programming problem.

A number of excellent methods exist for computing a linear approximation to a separable nonlinear function. These include a least squares fit procedure and a linear programming formulation to minimize either the sum of the absolute values or the maximum deviation. Geoffrion [3] discusses the use of objective function approximations in mathematical programming and presents methods for determining the "best" approximation for cases of particular interest in mathematical programming. GIPC2 employs a less elegant approximating procedure, but because of its simplicity and the nature of the problem, the procedure is well suited for this particular application. It should be noted that although the specific approximation employed will not affect the accuracy of the final solution obtained by the GIPC2 algorithm, poor approximations will have the consequence of increasing the computation time and storage requirements of the algorithm.

3. FINDING A GOOD FEASIBLE SOLUTION (PHASE I)

Our objective is to compute SMIN, a lower bound on the optimal objective function value, by finding a good feasible solution to the problem. We accomplish that by the following steps:

1. Replace g(x) by c_0 x and solve the resulting linear programming problem.

2. Force the continuous solution to a good feasible integer point, x̄, by successively testing each fractional variable at its rounded-down and rounded-up value, then fixing the variable at the integer point associated with the largest feasible value of the objective function.

3. Compute SMIN by substituting x̄ into the nonlinear objective function. If step 2 fails to yield a feasible integer solution, SMIN is set to a large negative number or may be specified on data input if a lower bound is known.
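The rounding step 2 above can be sketched as follows; the feasibility test and helper names are simplified stand-ins for the LP machinery of the paper, and only Ax ≤ b rows are handled in this toy version:

```python
# Sketch of the Phase I rounding step: each fractional variable is
# tried at its floor and ceiling and fixed at the feasible value giving
# the larger objective.  The continuous LP solution is assumed given.
import math

def feasible(x, A, b):
    return all(sum(aij * xj for aij, xj in zip(row, x)) <= bi
               for row, bi in zip(A, b))

def round_to_feasible(x_cont, A, b, g):
    x = list(x_cont)
    for j, xj in enumerate(x):
        if xj != int(xj):                       # fractional variable
            best = None
            for v in (math.floor(xj), math.ceil(xj)):
                trial = x[:j] + [v] + x[j + 1:]
                if feasible(trial, A, b) and (best is None or g(trial) > g(best)):
                    best = trial
            if best is None:
                # rounding failed: SMIN would fall back to a large
                # negative number, as described in step 3 above
                return None
            x = best
    return x

A, b = [[2, 1]], [5]
g = lambda x: x[0] + 4 * x[1]
x = round_to_feasible([1.5, 2.0], A, b, g)
```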
4. COMPUTING UPPER AND LOWER VARIABLE BOUNDS (PHASE II)

Once SMIN has been obtained, we next establish good upper and lower bounds on the variables so that the range of search may be narrowed. We do this by solving two linear programming problems of the form maximize x_i and minimize x_i, subject to the original constraints of the problem and the additional constraint that the objective function be greater than or equal to SMIN. For the nonlinear objective, this procedure would produce a nonlinear constraint, which we desire to avoid. By noting that since c_u x + a_u ≥ g(x), then g(x) ≥ SMIN implies c_u x + a_u ≥ SMIN, we will replace the nonlinear constraint g(x) ≥ SMIN with a series of linear constraints which conservatively approximate the single nonlinear constraint. The procedure for computing upper and lower bounds on x with a nonlinear objective function is as follows for each variable in the problem:

1. Determine initial variable bounds by solving the following two linear programming problems:

     Max x_i                         Min x_i
     s.t.  Ax ≶ b                    s.t.  Ax ≶ b
           c_u x ≥ SMIN − a_u              c_u x ≥ SMIN − a_u

2. Compute c_u x + a_u based on the current variable bounds.

3. Using the new c_u x + a_u, generate a new constraint c_u x ≥ SMIN − a_u and append it to both linear programming problems.

4. Solve the two linear programming problems to determine new variable bounds. If the variable bounds have been improved, as shown by a reduction in domain, a stronger "cut" may be possible, so go to step 2. Otherwise terminate the procedure and use the current variable bounds, UB_i and LB_i.

The procedure will obviously terminate at some point with no improvement, possibly with bounds that uniquely determine the value of some or all variables. Although the number of iterations required is problem dependent, the procedure typically converges within two or three iterations.
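A toy version of one Phase II bound computation is sketched below, with brute-force integer enumeration standing in for the two LPs; as in the text, the nonlinear cut g(x) ≥ SMIN is replaced by the linear surrogate c_u x + a_u ≥ SMIN:

```python
# Toy illustration of the Phase II variable-bound computation (the
# paper solves max x_i / min x_i as linear programs; here a brute-force
# scan over the integer box stands in for them).
from itertools import product

def variable_bounds(i, box, A, b, cu, au, smin):
    """Min and max of x[i] over integer points of `box` satisfying
    Ax <= b and the linear surrogate cut cu.x + au >= smin."""
    vals = [x[i] for x in product(*(range(lo, hi + 1) for lo, hi in box))
            if all(sum(r * v for r, v in zip(row, x)) <= bi
                   for row, bi in zip(A, b))
            and sum(c * v for c, v in zip(cu, x)) + au >= smin]
    return min(vals), max(vals)

# g(x) = x1^2 + x2 is bounded above by 5*x1 + x2 on 0 <= x1 <= 5,
# so the cut 5*x1 + x2 >= SMIN = 12 is a valid (toy) surrogate.
A, b = [[1, 1]], [6]
lo, hi = variable_bounds(0, [(0, 5), (0, 5)], A, b, [5, 1], 0, 12)
```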
5. GENERATING THE SET OF CANDIDATE SOLUTIONS (PHASE III)

In Phase III we enumerate in an efficient manner solutions that yield a value equal to or greater than SMIN, possibly including some solutions that are infeasible. In Phase IV we will identify the optimal (feasible) solution.

It is convenient to transform each domain (LB_i, UB_i) found in Phase II into a domain (0, UB_i − LB_i). This is done by defining a new vector y as y = x − LB and replacing x in our original problem with LB + y. Also, if the lower bound and upper bound are equal for any variable y_i, we can delete y_i from the problem as we know its optimal value is zero (the optimal value of x_i = LB_i = UB_i). Our problem now becomes:

(5)  Max G(y) + g(LB)
     s.t.  Ay ≶ b̄

where G(y) = g(y + LB) − g(LB) and the bar over the constant array b denotes modification of the original values due to the substitution y = x − LB. From Phases I and II we also know:

(6)  G(y) ≥ SMIN̄
     0 ≤ y ≤ UB̄

where

  SMIN̄, the new lower bound on the objective function after the transformation from x to y, is SMIN̄ = SMIN − g(LB)

  UB̄ is the vector of upper bounds on y and is equal to UB − LB.

Our procedure for generating the set of candidate solutions to problem (5) consists of enumerating all y vectors satisfying the conditions given by (6) and the constraint set Ay ≶ b̄. Note that the optimal y vector will satisfy all the above conditions and therefore will necessarily be contained within the set of candidate solutions. In order to facilitate the computations, we relax the condition Ay ≶ b̄ at several points in the procedure. As a result of this relaxation, the set of candidate solutions which is generated may contain entries which are not feasible to our problem. This relaxation allows us to accomplish the enumeration of y vectors in a recursive tabular fashion akin to the procedure employed in discrete dynamic programming.
However, Bellman's "Principle of Optimality" is never invoked in the process and, therefore, the assumption of monotonicity is not required in the development. Because of the similarities between discrete dynamic programming and the recursive tabular procedure employed here for enumerating the y-vectors, it will aid our discussion to borrow the following dynamic programming terminology:

  STAGE: a function of a single variable

  STATE: the state at stage k is the value of G(y) resulting from an assignment of integer values to y_k at stage k through y_n at the last stage, inclusive

  DECISION: a positive integer assignment to an element of y

  NDEC: the number of decisions made at a given stage-state

Our general procedure is to generate a table for each stage containing all potentially optimal states and the corresponding decisions at that stage which, in conjunction with prior decisions, produce that state value. By means of certain tests, we exclude a large number of entries from the tables by ascertaining that they are either infeasible or nonoptimal to our problem. Assignments which the tests fail to exclude, and which are thus contained in the tables, are termed candidate solutions to our problem.

The computations begin at the last stage (n) and recursively proceed to stage 1. The stage n computations are performed as a special case, with the computations for stages n − 1, n − 2, ..., k, ..., 1 proceeding as the general case. Therefore details of the stage n and stage k computations are sufficient to fully describe the algorithmic procedure for generating the set of candidate solutions to the problem.

Stage n Computations

In the stage n computations we simply enumerate in tabular form all possible state values for the integer domain of y_n. The following table is produced.
  STAGE n

  STATE      NDEC   DECISIONS
  G(0)        1      0
  G(1)        1      1
  G(2)        1      2
   ...
  G(UB̄_n)     1      UB̄_n

Stage k Computations

The general stage computations begin by forming and initializing two vectors, named TVEC and LVEC. The first records the total state value for each possible decision, and the second records location information relative to the previously generated stage. These vectors are used simply to produce efficiently the STATE, NDEC, DECISIONS table for each stage.

The vectors TVEC and LVEC are initially dimensioned equal to the number of possible decisions at stage k. The d-th entry in TVEC and LVEC corresponds to the decision y_k = d, with d initially ranging from 0 to UB̄_k. However, we will show that as the state value at a given stage increases, the number of possible decisions at that stage decreases. We will take advantage of this property to continuously reduce the dimension of TVEC and LVEC as the computations for stage k proceed.

Each entry in TVEC, corresponding to a given decision d assigned to y_k, is the total state value at stage k. The total state value for each decision is comprised of a fixed state contribution at stage k combined with the i-th total state at stage k + 1, where i is given in the corresponding location in LVEC. Defining S_{k+1,i} as the i-th state value in the stage k + 1 table, all possible total state values at stage k resulting from y_k = d are given by:

(7)  t_d(i, k) = g_k(d + LB_k) − g_k(LB_k) + S_{k+1,i}

where i is defined from 1 to the number of state values in the stage k + 1 table and g_k denotes the objective function for the k-th stage. Note that by systematically indexing (7) over all i for each entry in TVEC we can generate in ascending order of magnitude all possible state values and the corresponding decisions for stage k. This recursive relationship, in conjunction with two exclusion tests, is the basis for generating the set of candidate solutions to the problem.
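The ascending-order generation implied by (7) and the TVEC/LVEC bookkeeping amounts to a multi-way merge: for each decision d the totals are already sorted in i. The sketch below uses a heap as a compact stand-in for the paper's two vectors:

```python
# Emit stage-k states in ascending order from the sorted stage-(k+1)
# states, following eq. (7).  A heap replaces the TVEC/LVEC vectors.
import heapq

def stage_states(gk, LBk, UBk, next_states):
    """Yield (state, decision) pairs for stage k in ascending state
    order; next_states is the sorted list S_{k+1,1} <= S_{k+1,2} <= ..."""
    heap = [(gk(d + LBk) - gk(LBk) + next_states[0], d, 0)
            for d in range(UBk + 1)]
    heapq.heapify(heap)
    while heap:
        t, d, i = heapq.heappop(heap)
        yield t, d
        if i + 1 < len(next_states):
            # advance i for this decision, as in step 7 of the procedure
            heapq.heappush(heap,
                           (t + next_states[i + 1] - next_states[i], d, i + 1))

gk = lambda y: 3 * y                     # stage contribution 3*y_k
out = list(stage_states(gk, 0, 2, [0, 2, 4]))
```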
The purpose of the two exclusion tests is to exclude as many states and corresponding decisions as possible from the stage k table by discerning that they are either infeasible or nonoptimal to the problem.

Exclusion Test A

At each stage k, solve the following LP:

(8)  V = min Σ_{j=k}^{n} (c_{l,j} y_j + a_{l,j})
     s.t.  c_u y ≥ SMIN̄ − a_u
           Ay ≶ b̄
           y ≥ 0

where c_l, c_u, a_l, and a_u are the constants from the previously defined linear bounding functions and the subscript j is used to denote the j-th stage. Exclude all state values for which t_d(i, k) < V, and revise the lower bound LB̄_k accordingly. The optimal integer solution y* of the n-variable problem will be a point within the feasible region given in (8). It will have a state value at stage k, as given by the objective of (8), when only its y*_j for j = k, k + 1, ..., n are considered. V is a lower bound on the minimum state value at stage k considering all of the points in the feasible region. It follows that y*_j, j = k, k + 1, ..., n, the optimal decisions at stages k through n, will yield a state value ≥ V and will not be excluded by discard of all state values < V.

Exclusion Test B

At each stage k, solve the following two LPs, one with y_k fixed at its current lower bound and the other with y_k fixed at its current upper bound:

(9)   W = max Σ_{j=k}^{n} (c_{u,j} y_j + a_{u,j})
      s.t.  c_u y ≥ SMIN̄ − a_u
            Ay ≶ b̄
            y_k = LB̄_k

(10)  Z = max Σ_{j=k}^{n} (c_{u,j} y_j + a_{u,j})
      s.t.  c_u y ≥ SMIN̄ − a_u
            Ay ≶ b̄
            y_k = UB̄_k

where:

  LB̄_k denotes the current lower bound on y_k (initially 0)

  UB̄_k denotes the current upper bound on y_k (initially UB_k − LB_k).

Exclusion test B is dynamic in the sense that the bounds on y_k are continuously tightened as larger and larger state values are generated by the algorithm.
This tightening of bounds is accomplished as follows:

(a) if the current state > W, or if the problem is infeasible, replace LB̄_k by LB̄_k + 1 and compute the new value of W

(b) if the current state > Z, or if the problem is infeasible, replace UB̄_k by UB̄_k − 1 and compute the new value of Z

Recall the general process of generating candidate solutions using the TVEC and LVEC vectors. At stage k candidate values (d) are assigned to y_k, starting with the current LB̄_k. A state value Σ_{j=k}^{n} G_j(y_j) will result. However, if it exceeds W, an upper bound on the maximum feasible state value with y_k equal to LB̄_k, or if there is no feasible solution to problem (9), then LB̄_k is clearly not a valid assignment. The lower bound may be increased by one, to seek a feasible solution and/or a new increased value of W. Based on similar reasoning, the current upper bound UB̄_k may be tightened, by reducing it by one, whenever the state value exceeds Z, an upper bound on the maximum feasible state value with y_k equal to UB̄_k, or whenever there is no feasible solution to problem (10).

The proof that the optimal state value and corresponding decision y*_k will not be excluded follows from the fact that initially LB̄_k ≤ y*_k ≤ UB̄_k. As larger and larger state values are generated, the bounds on y_k will tighten until the upper and lower bounds are equal to y*_k. The bounds cannot be tightened to exclude y*_k since y* is feasible to the constraint set given in (9) and (10); therefore the corresponding state must be less than or equal to W and Z.

It should be noted that although we must recompute the values of W or Z when we tighten the upper or lower bound on y_k, there is no need to solve the entire LP given in (9) or (10) again. Since we are only changing one of the right-hand-side constants of the original LP, we can make use of the basis inverse to update the final tableau and employ the dual simplex algorithm when necessary to regain feasibility.
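Rules (a) and (b) can be sketched as a small loop; the callable below is a toy stand-in for re-solving the LPs (9) and (10), which the paper updates via the dual simplex:

```python
# Sketch of the dynamic tightening in Exclusion Test B.

def tighten(state, LBk, UBk, bound_with_yk_fixed):
    """Apply rules (a) and (b) until the current state passes both
    tests or the stage domain empties.  bound_with_yk_fixed(v) returns
    the upper bound W or Z for y_k fixed at v, or None if infeasible."""
    while LBk <= UBk:
        W = bound_with_yk_fixed(LBk)
        if W is None or state > W:          # rule (a)
            LBk += 1
            continue
        Z = bound_with_yk_fixed(UBk)
        if Z is None or state > Z:          # rule (b)
            UBk -= 1
            continue
        break
    return LBk, UBk

# Toy stand-in: with y_k = v, the best attainable state is 10 + 2*v.
lb, ub = tighten(state=17, LBk=0, UBk=6,
                 bound_with_yk_fixed=lambda v: 10 + 2 * v)
```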
The step-by-step procedure for generating the stage k table is as follows:

1. Compute the lower bound V as given by (8).

2. Form and initialize the vectors TVEC and LVEC, where the d-th entry in TVEC is given by t_d = g_k(d + LB_k) − g_k(LB_k) + S_{k+1,i}, where d initially ranges from 0 to UB̄_k and where i is chosen such that t_d is the minimum value greater than or equal to V (Exclusion Test A). The value of i is recorded as the d-th entry in LVEC.

3. Compute upper bounds W and Z as defined by (9) and (10) and use them to eliminate any infeasible decisions.

4. Flag all entries having the smallest state value in TVEC.

5. If the smallest state value is less than or equal to both W and Z, go to step 6. Otherwise, apply Exclusion Test B to tighten bounds on y_k. If LB̄_k > UB̄_k, the stage k table is complete. Otherwise go to step 4.

6. Enter values for the STATE, NDEC, and DECISIONS as one row of the stage k table.

7. Update the flagged entries in TVEC to the next largest possible state value for that decision by increasing i by 1 and update LVEC accordingly. If the d-th entry in TVEC is flagged, the updated TVEC(d) and LVEC(d) are given by

     t_d = t_d + S_{k+1,i+1} − S_{k+1,i}
     i_d = i_d + 1

   Go to step 4.

Example

To illustrate the computations in generating the set of candidate solutions, consider the following problem:

  Enumerate:  y1 + 3y2 + 2y3 ≥ 20
              0 ≤ y1 ≤ 5
              0 ≤ y2 ≤ 6
              0 ≤ y3 ≤ 6
              y ∈ I⁺

where the constraint set Ay ≶ b̄ is:  y1 + y2 + y3 ≤ 8

The computations follow the step-by-step procedure outlined above (beginning at stage 3) and produce the following tables:

  STAGE 3                   STAGE 2                   STAGE 1
  STATE NDEC DECISIONS      STATE NDEC DECISIONS      STATE NDEC DECISIONS
    0    1    0               18    2    4, 6           20    3    0, 1, 2
    2    1    1               19    1    5              21    2    0, 1
    4    1    2               20    2    4, 6           22    2    0, 1
    6    1    3               21    1    5
    8    1    4               22    1    6
   10    1    5
   12    1    6

Candidate solutions are recovered from the tables by tracking through the tables beginning at Stage 1 and working towards the last stage.
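The worked example can be cross-checked by brute force: enumerating every y inside the bounds with y1 + 3y2 + 2y3 ≥ 20 yields a superset of the tables above, since the LP-based exclusion tests prune further. The two candidates quoted in the text for goal state 22 do appear:

```python
# Brute-force check of the worked example: relaxed condition (6) only,
# with the Ay <= b constraint (y1 + y2 + y3 <= 8) checked afterwards.
from itertools import product

G = lambda y: y[0] + 3 * y[1] + 2 * y[2]
candidates = [y for y in product(range(6), range(7), range(7))
              if G(y) >= 20]

state22 = [y for y in candidates if G(y) == 22]
a, b = (0, 6, 2), (1, 5, 3)        # the two candidates for goal 22
assert a in state22 and b in state22
assert sum(a) <= 8 and sum(b) > 8  # only the first satisfies Ay <= b
```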
This tracking process can be thought of as generating a combinatorial "tree" of solutions for a specified starting or "goal" state. The nodes of the tree correspond to a given stage and state, and the branches emanating from a node correspond to the alternate decisions for that stage and state. A path through the tree represents an assignment of integer values to each stage of the problem. For example, with a state value of 22 we have two candidate solutions: y1 = 0, y2 = 6, y3 = 2 and y1 = 1, y2 = 5, y3 = 3, the latter being nonfeasible to the Ay ≶ b̄ constraint.

6. IMPLICIT ENUMERATION OF CANDIDATE SOLUTIONS (PHASE IV)

In generating the set of candidate solutions, we have excluded only state values and assignments which could be shown to be infeasible or nonoptimal to our problem. Therefore the optimal feasible solution to our problem is necessarily contained within the set of feasible and possibly infeasible solutions. Our strategy is to search for a solution within the set which is feasible with respect to the constraints of our problem. To guarantee that the first feasible solution found is also optimal, the search is performed starting with the largest state value at Stage 1 and working towards the smallest state value at Stage 1.

For a given goal state, the number of candidate solutions is simply the number of paths emanating from the corresponding state of Stage 1. For simple trees explicit enumeration, by substituting each candidate solution into the constraint set of the problem and testing for feasibility, is quite practical. Due to the combinatorial nature of the tree, however, this approach can become computationally overburdening for larger problems. We will therefore employ a method for implicitly evaluating candidate solutions. Through the application of a fathoming test, large portions of the combinatorial tree will be exempted from enumeration.
The implicit evaluation procedure starts by selecting the largest state value at Stage 1; this is termed the present goal and there will be a corresponding tree. The examination of paths through the tree is performed by the systematic assignment of values to the y-variables at each stage, starting at the first stage by assigning a value to y1. A partial evaluation at stage j is defined as the assignment of integer values from the first stage node through the j-th stage node, inclusive. The state contribution resulting from this partial evaluation is designated ZINT.

All paths through the tree will yield the present goal, but not all paths will yield y-values that satisfy the constraints of the original problem. Our purpose is to devise a test to detect as early as possible in the search if a particular branch (and its sub-branches) cannot yield a candidate solution feasible to the constraints of the original problem. We accomplish this by comparing the goal state value to the sum of ZINT and the continuous maximal solution (ZCONT) for the y-variables not yet assigned values. Since ZCONT is greater than or equal to any feasible integer completion for the unassigned y-variables, if the sum of ZINT plus ZCONT is less than the present goal state, we can exclude all integer completions of this partial evaluation from consideration. At this point the branch is said to be "fathomed" and we "backtrack", that is, go back to the preceding node and evaluate the remaining branches emanating from it.

If the present goal is achieved, by completing a path through the tree without violating any constraints of the original problem, the current assignment to y is added to the lower bound vector (LB) of x and the algorithm terminates. If the present goal is not achieved, that is, if no feasible path through the tree exists, then the next largest state value at Stage 1 is used as the goal state and its applicable tree is searched.
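A minimal sketch of the fathoming search just described, run on the data of the earlier example; the `zcont` callable here is a crude analytic bound standing in for the LP/sensitivity-analysis computation of ZCONT in the paper:

```python
# Depth-first assignment of the y-variables with fathoming: a branch is
# abandoned when ZINT plus the optimistic completion bound ZCONT falls
# short of the goal state.

def search(goal, domains, G_terms, feasible, zcont, y=()):
    j = len(y)
    if j == len(domains):                       # complete path
        total = sum(G_terms[i](v) for i, v in enumerate(y))
        return y if total == goal and feasible(y) else None
    zint = sum(G_terms[i](v) for i, v in enumerate(y))
    if zint + zcont(j) < goal:
        return None                             # fathomed; backtrack
    for v in domains[j]:
        found = search(goal, domains, G_terms, feasible, zcont, y + (v,))
        if found:
            return found
    return None

# Data of the earlier example: G(y) = y1 + 3*y2 + 2*y3, goal state 22.
G_terms = [lambda v: v, lambda v: 3 * v, lambda v: 2 * v]
domains = [range(6), range(7), range(7)]
zcont = lambda j: sum(max(d) * c for d, c in
                      zip(domains[j:], (1, 3, 2)[j:]))  # optimistic bound
sol = search(22, domains, G_terms, lambda y: sum(y) <= 8, zcont)
```

With the goal state 22, the search fathoms the shallow branches and returns the feasible candidate identified in the example.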
The algorithm will terminate because the range of state values generated is bounded such that it includes at least one feasible y-vector.

The implicit evaluation method described above requires a computationally efficient procedure for computing ZINT + ZCONT at each partial evaluation. One method would be to compute ZCONT by solving for the unassigned y-variables as a linear programming problem. For many problems this would necessitate solving a large number of linear programming problems. However, by viewing each assignment to y as a change to the right-hand side of the continuous LP, ZINT + ZCONT can be conveniently computed using sensitivity analysis.

7. COMPUTATIONAL EXPERIENCE

The performance of an integer programming algorithm is measured by its ability to solve a wide class of integer programming problems within reasonable computer time and storage limitations. To provide a basis for comparison with existing algorithms, the GIPC2 procedure was programmed in ANSI FORTRAN and implemented on the Purdue CDC 6500 computing system. Instructions for its use and a FORTRAN listing of the program are provided in reference [9]. The GIPC2 algorithm was evaluated on a set of seventeen linear and seventeen nonlinear test problems. Although the algorithm was specifically designed to solve the nonlinear integer programming problem, we were interested in evaluating the performance of the algorithm on linear problems as a special case. To provide a basis for comparison with existing linear integer programming algorithms, the seventeen linear test problems were also solved using a Branch and Bound code.

The main difficulty in comparing the computational efficiency of different integer programming algorithms is in developing a representative test problem set containing problems of varying size and difficulty.
It is important to note that the relative performance of two integer programming algorithms may be highly dependent upon the test problem set used. In addition, it should be noted that problem size is only one factor in determining problem difficulty, and this factor is often dominated by problem structure. A problem with only five variables can be significantly more difficult to solve than a problem with twenty-five or more variables.

The set of linear test problems used in this investigation includes four problems containing five variables each developed by Haldi [4], and thirteen additional problems of larger size. The four problems of Haldi, despite their small size, are difficult problems to solve and have been used extensively as a test bed for integer programming algorithms. Problem number five is a system design problem given by Petersen [10] and contains fourteen integer variables. The remaining twelve problems were randomly generated, range in size from ten variables to twenty-five variables, and differ widely in their difficulty to solve. None of the test problems have explicit upper-bounded variables.

The Branch and Bound code used in the investigation is the MINT mixed integer programming algorithm [6] based on the BBMIP code developed by the IBM Corporation [5] for the IBM 360 models 25 and above. The program is written in FORTRAN and is based upon the Dakin improved procedure of Land and Doig. A more modern code such as MPSX-MIP/360 or UMPIRE was not available on the Purdue CDC system, or it would have been used for a more meaningful comparison.

The computation times for the seventeen linear test problems are presented in Table 1. All times are in seconds and are for the Purdue CDC 6500 computing system. Times given in the table that are preceded by a greater-than sign indicate that the respective algorithm terminated without an optimal solution established after that amount of computation time.

TABLE 1.
Computational Experience — Linear

  Problem  Number of    Number of   Computation Time (secs)
  Number   Constraints  Variables     GIPC2      MINT
     1          4           5           .545      4.099
     2          4           5           .400      2.972
     3          6           5           .608      3.375
     4          6           5           .434      3.457
     5          8          14          7.453     36.297
     6          5          10           .811     20.221
     7          5          10          1.042     21.001
     8         10          10           .804       .537
     9         10          10           .888      1.488
    10          5          20          4.195    >188.
    11          5          20          3.803     30.250
    12         10          20         10.261     32.882
    13         10          20         10.430      3.549
    14          5          25          7.422    >188.
    15          5          25          5.610     30.758
    16         10          25         21.545     32.364
    17         10          25         64.497*     5.440

  *Reduced to 13.3 seconds by reordering variables in ascending order of their domain (see suggested modification in Section 8 below).

The GIPC2 code clearly outperformed the MINT code in solving the test problems of Haldi. Note that the MINT code required more time to solve problem 1 of Haldi, containing only five variables, than it required to solve problem 13, containing twenty variables. The test problems of Haldi clearly illustrate that problem structure can be more significant in determining problem difficulty than problem size.

In test problems 5 through 17, neither algorithm computationally dominates the other. The results for test problems 5, 6, 7, 9, 11, 12, 15, and 16 tend to indicate that the GIPC2 code is on average 8.7 times faster than the MINT code, and the performance on problems 10 and 14 shows GIPC2 vastly superior. However, this conclusion is contradicted by the results of test problems 8, 13, and 17, where the MINT code is 2.3 times faster than the GIPC2 code. The performance of the GIPC2 and MINT codes on these problems illustrates the unpredictable performance that is associated with integer programming algorithms.

A significant point of superiority of the GIPC2 code, however, is illustrated by comparative results on problems 10 and 14. Although the Branch and Bound procedure has been employed successfully to solve a number of large problems [2], it is sometimes misled into taking the wrong path early in the search.
As a consequence, the Branch and Bound procedure can require an excessive amount of computer time to solve even relatively small problems. The MINT code failed to solve problems 10 and 14 after 188 seconds of computation. The computational results to date tend to indicate that the GIPC2 algorithm is less susceptible to getting sidetracked with large running times.

The performance of the GIPC2 algorithm in solving integer programming problems with separable nonlinear objective functions was investigated by solving a set of seventeen nonlinear test problems. The nonlinear test problems were generated by using the constant arrays from the linear problem set, with five or more of the linear terms in the objective function being replaced by nonlinear terms. Problems 18 through 21 each contain five variables and were constructed from the problems of Haldi by replacing the linear objective function with five nonlinear stages. Problem 22 is a nonlinear version of the system design problem given by Petersen. The remaining twelve problems each contain from twenty-five to fifty variables and either five, fifteen, twenty, twenty-five, forty, or fifty nonlinear stages.

Computation times for the seventeen nonlinear test problems are given in Table 2. The computation times compare favorably with computation times for linear problems of similar size. Note that test problems 32 and 34, containing forty and fifty nonlinear stages respectively and ten constraints, each solved in less than twenty seconds. The data tend to suggest that the integer programming problem with a nonlinear objective function is of relatively the same difficulty for the GIPC2 algorithm as the linear integer programming problem.
The ability of the GIPC2 algorithm to solve the integer programming problem containing a separable nonlinear objective function in roughly equivalent times to those required to solve the linear integer programming problem is one of the primary contributions of the research reported here.

TABLE 2. Computational Experience — Nonlinear

  Problem  Number of    Number of  Number of          GIPC2 Computation
  Number   Constraints  Variables  Nonlinear Stages   Time (secs)
    18          4           5           5                 .202
    19          4           5           5                 .212
    20          6           5           5                 .203
    21          6           5           5                 .215
    22          8          14          12               10.312
    23         10          25           5                4.905
    24         10          25           5               15.494
    25         10          25          15                7.550
    26         10          25          15               20.204
    27         10          25          20                6.998
    28         10          25          25                5.522
    29         10          25          25                7.299
    30         10          25          25                9.756
    31         10          40          25               12.493
    32         10          40          40               11.892
    33         10          50          25               17.596
    34         10          50          50               18.717

The application of the nonlinear capability of the GIPC2 algorithm to a practical problem is illustrated by test problem 22, the nonlinear version of problem 5. In the original problem the system maintenance and operating costs which are to be minimized were assumed to be a linear function of the number of system components by type. However, in many systems the maintenance and operating costs are a nonlinear function of the number of system components by type. The restrictive linearity assumption is imposed primarily as a consequence of the lack of practical algorithms for solving the nonlinear integer programming problem. However, the GIPC2 algorithm solved the more descriptive nonlinear version of the systems design problem in 10.312 seconds, as compared to the 36.297 seconds required by the MINT code to solve the linear version of the problem.

A major difficulty encountered in evaluating GIPC2 for solving nonlinear integer programming problems is in verifying that the solutions obtained are indeed optimal. The nonlinear test problems are difficult problems to solve and alternate methods of solution apparently do not exist.
To verify the accuracy of the GIPC2 algorithm in solving the nonlinear integer programming problem, a relatively simple ten-variable, five-constraint, nonlinear integer programming problem was exhaustively enumerated. The enumeration required approximately thirty-five minutes of computation time on the Purdue CDC 6500. The GIPC2 algorithm yielded the same solution in approximately two seconds.

8. CONCLUSIONS

The generalized implicit enumeration scheme described in this paper can solve both linear and nonlinear integer programming problems. Computational experience indicates that the presence of nonlinearities has little or no effect on the computational efficiency of the algorithm. This attribute of the GIPC2 algorithm should allow the formulation and solution of integer programming problems which fully consider the economies of scale that exist in real problems.

A modification which would facilitate the use of the GIPC2 algorithm on larger problems is the replacement of the present simplex subroutine with a revised simplex method possessing implicit upper-bounding procedures for the variables. This would allow the initial data matrix of the problem to be kept in external storage and would also avoid the need to include explicit upper-bound constraints on the variables. This last feature would be particularly useful in solving zero-one integer programming problems.

A simple modification to the GIPC2 code that would considerably reduce computation time is a scheme for automatically reordering the variables in ascending order of domain size before generating the set of candidate solutions. As a consequence of this reordering, the trees of candidate solutions would tend to be sparse near the top (Stage 1), and the number of partial evaluations examined would be reduced.
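The proposed reordering can be sketched in a few lines. The Python fragment below is an illustrative reconstruction only (GIPC2 itself is a FORTRAN code; the function name and the bounds shown are hypothetical): it sorts the variable indices so that the variables with the fewest feasible values are enumerated first.

```python
# Hypothetical sketch of the reordering heuristic described above: sort the
# variables so those with the smallest domains are branched on first, keeping
# the tree of candidate solutions sparse near Stage 1.
# The function name and bounds are illustrative, not taken from GIPC2.

def reorder_by_domain(lower, upper):
    """Return variable indices ordered by ascending domain size."""
    sizes = [u - l + 1 for l, u in zip(lower, upper)]
    return sorted(range(len(sizes)), key=lambda j: sizes[j])

# Example: variable 2 can take only two values, so it is enumerated first.
order = reorder_by_domain([0, 0, 0, 0], [9, 4, 1, 6])
assert order == [2, 1, 3, 0]
```

Applying such a permutation before generating candidate solutions reproduces, in spirit, the manual reordering reported for test problem 17.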
The effect of this modification is illustrated by test problem 17, which originally required 64.5 seconds to solve. After the variables were manually reordered in ascending order of domain size, the problem was solved in 13.3 seconds.

REFERENCES

[1] Balas, E., "An Additive Algorithm for Solving Linear Programs with Zero-One Variables," Operations Research, 13, 517-546 (1965).
[2] Forrest, J.J.H., J.P.H. Hirst and J.A. Tomlin, "Practical Solution of Large Mixed Integer Programming Problems with UMPIRE," Management Science, 20, No. 5, 736-773 (1974).
[3] Geoffrion, A., "Objective Function Approximations in Mathematical Programming," Discussion Paper No. 61, Management Science Study Center, University of California, Los Angeles (May 1976).
[4] Haldi, J., "25 Integer Programming Test Problems," Working Paper No. 43, Graduate School of Business, Stanford University (December 1964).
[5] IBM Catalog of Programs for IBM System 360 Models 25 and Above, GC 20-1619-8, Program Number 360D-15.2.005.
[6] Kuester, J. and J. Mize, Optimization Techniques with FORTRAN (McGraw-Hill, 1973).
[7] Marsten, R. and T. Morin, "A Hybrid Approach to Discrete Mathematical Programming," Sloan School of Management, Working Paper 838-76 (March 1976).
[8] Morin, T. and R. Marsten, "An Algorithm for Nonlinear Knapsack Problems," Management Science, 22, No. 10 (1976).
[9] Pegden, C.D., "An Implicit Enumeration Algorithm for Solving Integer Programming Problems with Linear or Nonlinear Objective Functions," Ph.D. Dissertation, Purdue University (August 1975).
[10] Petersen, C.C., Systems Planning and Evaluation Techniques, textbook in preparation.
[11] Plane, D.R. and C. McMillan, Discrete Optimization (Prentice-Hall, Inc., New Jersey, 1971).
[12] Taha, H., "A Balasian-Based Algorithm for Zero-One Polynomial Programming," Management Science, 18, No. 6 (1972).
[13] Watters, L.J., "Reduction of Integer Polynomial Programming Problems to Zero-One Linear Programming Problems," Operations Research, 15, 1171-1174 (1967).

DUALITY FOR QUASI-CONCAVE PROGRAMS WITH APPLICATION TO ECONOMICS

T. R. Jefferson, G. M. Folie, and C. H. Scott
University of New South Wales
Kensington, N.S.W., Australia

ABSTRACT

A duality theory is developed for mathematical programs with strictly quasi-concave objective functions to be maximized over a convex set. This work broadens the duality theory of Rockafellar and Peterson from concave (convex) functions to quasi-concave (quasi-convex) functions. The theory is closely related to utility theory in economics. An example from economic planning is examined, and the solution to the dual program is shown to have the properties normally associated with market prices.

1. INTRODUCTION

Although duality theory for linear programming has been well developed and widely used for some time, it is only in recent years that significant advances have been made in duality theory for convex (concave) programs. Notable contributions have been made by Rockafellar [13] and Peterson [12]. Despite these developments, there are still many programming problems that are not encompassed by the existing theory. One such important class of mathematical programs is the quasi-concave programs, and it is the purpose of this paper to extend the benefits of duality theory to this class.

In 1961, Arrow and Enthoven [1] developed necessary and sufficient conditions for the optimality of quasi-concave programs. Later Luenberger [11] developed a duality theory for quasi-concave programs which separated primal and dual variables. This duality theory was expanded by Greenberg and Pierskalla into surrogate duality [5], [7]. The duality theory developed here is valid for quasi-concave programs with closed strictly quasi-concave objective functions.
This work is motivated by the dual relationship between goods and prices first observed by Roy [14] and by Peterson's work in duality theory [12]. The major result of this paper lies in the separation of the objective function from a linear constraint set, which simplifies the derivation of the dual program as well as the relationship between the primal and dual programs. Furthermore, the duality theory developed here is widely applicable to a class of problems found in economics.

Duality theory comes naturally to linear programs via Kuhn-Tucker theory and the linearity of the problem. In general, the presence of nonlinearities in mathematical programs raises a number of problems and makes generalizations more complex and difficult; Wolfe duality is an example. For concave programs, the concave conjugate transform can be used to derive a dual program and to develop all the associated primal-dual relationships (Peterson [12]).

In order to clarify the difference between the duality theory developed in this paper and that currently used for concave programs, as well as why a special duality theory is needed, the following brief digression is made. Consider the following definitions.

DEFINITION: The concave conjugate transform of a function g(x) defined on a set C is the pair [h, D] defined by

    h(y) ≜ inf_{x∈C} {⟨x, y⟩ − g(x)},
    D ≜ {y | inf_{x∈C} {⟨x, y⟩ − g(x)} > −∞}.

DEFINITION: The hypograph of a function g(x) is the set {(x, β) | x ∈ C, β ≤ g(x)}.

DEFINITION: A concave (quasi-concave) function is closed when its hypograph is a closed set.

DEFINITION: The supergradient of a function g at a point x is the set ∂g(x) defined by

    ∂g(x) = {y | g(x) + ⟨y, z − x⟩ ≥ g(z), ∀ z ∈ C}.

By construction h(y) is a closed concave function. In addition, for x ∈ C and y ∈ D we have the following inequality:

(1)    g(x) + h(y) ≤ ⟨x, y⟩.

The inequality (1) is an equality when x ∈ ∂h(y) or y ∈ ∂g(x).

The concave conjugate transform generates very strong relationships. If g(x) is not concave, the concave conjugate transform still operates on g(x) as if it were concave: the conjugate transform of a non-concave function g(x) does not use the hypograph of g(x), but only the convex hull of the hypograph. Thus some information regarding the properties of g(x) is lost by this transform. This undesirable feature of the conjugate transform indicates the need to develop a new transform which can be used to derive a duality theory for non-concave programs.

At first it may not appear a particularly serious limitation that the conjugate transform cannot be used with non-concave functions. However, there are cases where a transform that will handle quasi-concave programs is required. For instance, the conjugate transform cannot be used to derive dual programs for problems in economic theory. The reason is that economic theorists have, over the years, reduced the restrictiveness of both consumer and producer theory. The fundamental property of the utility function in the theory of consumer behaviour, which stems from the axioms of weak preference ordering, is that indifference curves, or constant-utility surfaces, define convex sets. Equivalently, economists require these indifference curves to have the property of diminishing marginal rates of substitution (Green [4]). Thus the minimal property of any utility function used to represent consumer choice behaviour is quasi-concavity, and in order to derive dual programs for economic problems it is essential that the transform used to obtain the dual be valid for quasi-concave programs. The transform derived here has these desired properties, and will be called the utility transform.
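The information loss described above can be made concrete numerically. The sketch below is not from the paper: it takes the non-concave function g(x) = x² on C = [−1, 1], whose concave envelope on C is the constant function 1, and evaluates both conjugates on a finite grid. The two conjugates coincide (here h(y) = −|y| − 1), so the conjugate transform cannot distinguish g from the concave envelope of its hypograph.

```python
# Illustration (invented example, not from the paper) of how the concave
# conjugate h(y) = inf_{x in C} (<x, y> - g(x)) sees only the convex hull of
# the hypograph of g.  g(x) = x^2 is non-concave on C = [-1, 1]; its concave
# envelope there is the constant 1, yet both yield the same conjugate.

def concave_conjugate(g, xs):
    """Approximate h(y) = inf over the grid xs of (x*y - g(x))."""
    return lambda y: min(x * y - g(x) for x in xs)

xs = [i / 1000.0 - 1.0 for i in range(2001)]      # grid on C = [-1, 1]
h_g   = concave_conjugate(lambda x: x * x, xs)    # conjugate of g
h_env = concave_conjugate(lambda x: 1.0, xs)      # conjugate of the envelope

for y in (-2.0, -0.5, 0.0, 1.5):
    assert abs(h_g(y) - h_env(y)) < 1e-9          # the two conjugates agree
    assert abs(h_g(y) - (-abs(y) - 1.0)) < 1e-9   # closed form: -|y| - 1
```

Both infima are attained at the endpoints x = ±1, which the grid contains exactly, so the agreement here is exact rather than approximate.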
The properties of the utility transform are derived in the next section and are then used to develop a duality theory for quasi-concave programming. This theory is an extension of the duality theory developed by Luenberger [11]. An example from economics is presented at the end of the paper in order to elucidate the usefulness of the utility transform.

2. THE UTILITY TRANSFORM

A form of duality between prices and commodities in consumer theory was originally noted by Roy [14] in 1947. Roy's work explicitly developed a dual relationship between the consumer's direct utility function, which is a function of his commodity bundle, and an indirect utility function, which is a function of the prices of the commodities and consumer income. Recently this theory has been used to provide a clearer understanding of consumer theory by proving, in a simple manner, a large number of propositions in consumer theory. Lau [10] and Diewert [3] provide a useful compendium of this work. Although the concept of the utility transform stems from the relationship between the direct and indirect utility functions, the purposes of this paper require the development of a slightly different approach to duality.

Consider the pair [u, U] of a utility function u(x) defined on the convex set U.

DEFINITION: The utility transform of [u, U] is the pair [v, V] defined by

    v(p) ≜ inf_{x∈U} {−u(x) | ⟨p, x⟩ ≤ 0},
    V ≜ {p | inf_{x∈U} {−u(x) | ⟨p, x⟩ ≤ 0} > −∞}.

By construction, for x ∈ U, p ∈ V and ⟨p, x⟩ ≤ 0, the utility inequality holds:

    u(x) + v(p) ≤ 0.

The construction of v(p) differs from that of the indirect utility function in the following ways. Firstly, the linking constraint ⟨p, x⟩ ≤ 0 is generally referred to as the budget constraint, but in consumer theory it usually has a positive right-hand side. However, as will be seen later, it is convenient to absorb the right-hand side into the inner product.
This can be done by identifying consumer income as another commodity which, although the consumer has an endowment of it, he does not wish to hold for its own sake. A more general interpretation is to consider x to measure the quantities of goods and services that a consumer buys and sells. We use this second approach in the example (Section 4). Finally, we take the infimum of −u(x) rather than the supremum of u(x). These differences allow us to work with quasi-concave functions: v(p) is quasi-concave, whereas the indirect utility function is quasi-convex. More importantly, the absorption of the right-hand side of the budget constraint permits a complete separation of the price and commodity variables.

It is accepted that the utility transform is not the only method for handling duality. Greenberg and Pierskalla [5] developed a surrogate dual which, in the notation used here, is defined by

    inf {−u(x) | Σ_i p_i (g_i(x) − b_i) ≤ 0},

where g_i(x) ≤ b_i is a constraint to be satisfied. Clearly, when g_i(x) − b_i is replaced by x_i, the above expression is the same as the utility transform.

The surrogate dual was further specialized for quasi-concave functions by Greenberg and Pierskalla [7] to the z-quasi-conjugate

    u*(p) = z + inf {−u(x) | ⟨x, p⟩ ≤ z}.

This becomes the utility transform when z = 0. In their paper on "Quasi-Conjugate Functions and Surrogate Duality" [7], Greenberg and Pierskalla develop the properties of the z-quasi-conjugate; that paper formed the basis of a further analysis by Crouzeix into the properties of quasi-concave functions [2]. The z variable in the z-quasi-conjugate is difficult to handle in the dual. In order to take cone conditions into consideration, it is necessary to have ⟨x, p⟩ ≤ 0. This means that z = 0, and we are left with the more convenient utility transform.

We now develop the properties of [v, V].
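Before developing those properties formally, the transform can be illustrated on a finite grid. The sketch below is purely illustrative (the bivariate function u, which is concave and hence quasi-concave, and the grid are invented): it approximates v(p) = inf {−u(x) | ⟨p, x⟩ ≤ 0}, spot-checks the utility inequality u(x) + v(p) ≤ 0 over all feasible grid points, and checks the degree-zero homogeneity v(2p) = v(p) established in Lemma 1 below.

```python
# Finite-grid sketch (invented example) of the utility transform
# v(p) = inf { -u(x) : <p, x> <= 0 } and of the utility inequality.

def utility_transform(u, grid, p):
    """Approximate v(p) over the grid points x satisfying <p, x> <= 0."""
    feasible = [x for x in grid if p[0] * x[0] + p[1] * x[1] <= 0.0]
    return min(-u(x) for x in feasible)    # x = (0, 0) is always feasible

u = lambda x: min(x[0] + 2.0, x[1] + 1.0)  # concave, hence quasi-concave
grid = [(0.5 * i - 2.0, 0.5 * j - 2.0) for i in range(9) for j in range(9)]

for p in [(1.0, 0.0), (1.0, 1.0), (0.5, 2.0)]:
    v_p = utility_transform(u, grid, p)
    for x in grid:
        if p[0] * x[0] + p[1] * x[1] <= 0.0:
            assert u(x) + v_p <= 1e-12                     # utility inequality
    assert utility_transform(u, grid, (2.0 * p[0], 2.0 * p[1])) == v_p
```

The homogeneity check is exact because scaling p by a positive constant leaves the feasible set, and hence the infimum, unchanged.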
In order to do this we require a relaxation of the concept of supergradient.

DEFINITION: The local supergradient of a quasi-concave function u(x), x ∈ U, is the set ∂^loc u(x) defined by

    ∂^loc u(x) ≜ {p | lim_{α→0+} [u(x + αΔx) − u(x)]/α ≤ ⟨Δx, p⟩, ∀ Δx such that x + βΔx ∈ U, β > 0}.

The local supergradient is a generalization of the concept of supergradient presented earlier. For the usual case of differentiable functions, the local supergradient contains a single element: the gradient. It is through the local supergradient that [u, U] and [v, V] can be related, and this relationship is needed in order to develop the duality theory presented in the next section. The relationship between [u, U] and [v, V] is formally expressed by Theorem 1, which is stated at the end of this section and proved in the Appendix. In order to prove Theorem 1, the following four lemmas are needed.

LEMMA 1: v(p) is quasi-concave and positively homogeneous of degree zero. V is a convex cone. ∂^loc v(p) is positively homogeneous of degree minus one.

PROOF: See Appendix.

LEMMA 2: For v, the utility transform of u, hypo v is closed if hypo u is closed.

PROOF: See Appendix or [7].

LEMMA 3: For u closed, p ∈ V, x ∈ U and ⟨p, x⟩ ≤ 0 we have the utility inequality

(2)    u(x) + v(p) ≤ 0.

When equality holds in (2) we have:

    (i) λp ∈ ∂^loc u(x), λ ≥ 0;
    (ii) μx ∈ ∂^loc v(p), μ ≥ 0.

PROOF: See Appendix.

LEMMA 4: Suppose u is a closed strictly quasi-concave function on U with utility transform [v, V]. Then either

    (i) the maximum of u is attained at a point z ∈ U and v(p) = −u(z) for p ∈ V ∩ {p | ⟨p, z⟩ ≤ 0}. Let p be equivalent to q (p ≡ q) if there exists α > 0 such that p = αq. Then v(p) is a strictly quasi-concave function on V ∩ {p | ⟨p, z⟩ > 0} with respect to the quotient space defined by this equivalence relation; or

    (ii) the supremum of u(x) is infinite.
In this case v(p) is a strictly quasi-concave function on V with respect to the quotient space defined in (i).

PROOF: See Appendix.

THEOREM 1: Let [u, U] have the following properties:

    (i) the hypograph of u is closed;
    (ii) u(x) is strictly quasi-concave;
    (iii) [v, V] is the utility transform of [u, U];
    (iv) z ∈ U is the optimal point of sup u(x), if it exists.

Given that x̄ ∈ U, p̄ ∈ V, ⟨p̄, x̄⟩ ≤ 0 and x̄ ≠ z, then u(x̄) + v(p̄) = 0 if and only if either

    (I) λp̄ ∈ ∂^loc u(x̄), λ > 0, or
    (II) (a) p̄ ∈ S, where S = V, or S = V ∩ {p | ⟨p, z⟩ > 0} if z exists;
         (b) μx̄ ∈ ∂^loc v(p̄), μ > 0;
         (c) ⟨p̄, x̄⟩ = 0;
         (d) u(x̄) = sup {u(y) | y ∈ U, y = x̄ or y = −x̄}.

See the Appendix for the proof.

3. DUALITY THEORY

Consider the following quasi-concave program.

PROGRAM A:
    maximize u(x)
    subject to x ∈ U ∩ χ,

where u is a closed strictly quasi-concave function defined on the convex set U, and χ is a convex cone. We assume that if z is such that u(z) = sup u(x), then z ∉ χ; that U ∩ χ ≠ ∅; and that sup {u(x) | x ∈ U ∩ χ} < ∞.

The economic dual to Program A is:

PROGRAM B:
    maximize v(p)
    subject to p ∈ V ∩ Π,

where [v, V] is the utility transform of [u, U], and Π is the dual cone to χ defined by

    Π = {p | ⟨p, x⟩ ≤ 0, ∀ x ∈ χ}.

At optimality the following relations hold:

    x̄ ∈ U ∩ χ,  p̄ ∈ V ∩ Π,
    ⟨p̄, x̄⟩ = 0,
    u(x̄) + v(p̄) = 0,
    λp̄ ∈ ∂^loc u(x̄), λ > 0,
    μx̄ ∈ ∂^loc v(p̄), μ > 0,
    u(x̄) = sup {u(y) | y ∈ U, y = x̄ or y = −x̄}.

An interpretation of these optimality conditions is given when the example is discussed in the following section.

THEOREM 2: The previously stated optimality conditions are necessary and sufficient for optimality in Programs A and B.

4. EXAMPLE

The theory of economic planning is concerned with devising an allocation of resources which maximizes a given social welfare function of the society in question, given that the society has a prescribed endowment of labour, together with some consumption goods remaining from the previous period.
The society has available a set of known production technologies which take inputs, such as labour, and transform them into consumer goods. The use of some goods as both production inputs and consumer goods is not precluded. Problems of this type are discussed by Heal [8].

Thus in a directive economy the central planning office must effectively solve a large mathematical programming problem. In order to illustrate the duality concepts developed earlier, consider a simplified version of the planning problem which contains all the essential elements of economic planning. The problem is defined as follows:

    max w(x + e) ≜ w(x; e)
    subject to x ≥ −e,
               x = Az, z ≥ 0,

where x = (x^c, x^l)′; here superscripts denote vectors and subscripts denote scalars. w(x; e) is a known quasi-concave social welfare function which captures the preference orderings that this society has for its consumption of goods, x^c + e^c, and its use of leisure, x^l + e^l. The society has an endowment of leisure, e^l, which it can forego, in quantities x^l, to provide labour for productive activities that produce more consumer goods, x^c, which increase aggregate social welfare. It should be noted that the nature of the coefficients of the production activity matrix A will ensure that only positive quantities of consumer goods are produced and that leisure is consumed by production, x^l ≤ 0; i.e., only positive quantities of labour are supplied.

This problem will now be expressed in the format developed in Section 3 in order to illustrate the relationship between the primal and dual programs. This is done for social welfare functions which are assumed to be log-linear. The problem can then be expressed in a form similar to PROGRAM A of the previous section:

    max w(x; e) = Σ_i α_i log(x_i + e_i)
    subject to x ∈ U ∩ χ,

where

    U = {x | x_i > −e_i, ∀i}  and  χ = {x | x = Az, z ≥ 0}.
Clearly χ is a cone, defined by the technological possibilities available to the economy. Thus this particular formulation of the primal problem treats production through the cone which, as will be shown below, results in a considerable simplification.

To obtain PROGRAM B, the economic dual, it is necessary to derive the utility transform of w(x; e):

    inf_{x∈U} {−w(x; e) | ⟨p, x⟩ ≤ 0}.

Forming the Lagrangian and then differentiating with respect to x and λ, the following results are obtained:

    α_i (x_i + e_i)^{−1} − λp_i = 0, ∀i,
    ⟨p, x⟩ = 0.

From simple algebra it can now be shown that

    x_i + e_i = α_i Σ_j p_j e_j / (p_i Σ_j α_j), ∀i.

Before continuing, the discussion can be simplified by assuming that the α_i are selected so that Σ_i α_i = 1, since all the expressions derived clearly imply that the α_i are normalized. Substituting for x_i, the utility transform v(p; e) of w(x; e) is obtained:

    v(p; e) = Σ_i α_i log [ p_i / (α_i Σ_j p_j e_j) ],
    V = {p | p_i ≥ 0, ∀i}.

Thus the economic dual to the original problem can now be written in the form of PROGRAM B of the previous section:

    max Σ_i α_i log [ p_i / (α_i Σ_j p_j e_j) ]
    subject to p ∈ V ∩ Π,

where V = {p | p_i ≥ 0, ∀i} and Π = {p | A′p ≤ 0}. Here Π is the polar cone of χ, and its derivation follows from the well-known properties of finite cones.

It is of some interest to provide an interpretation of this dual program. The most important property that emerges is that the dual program is expressed solely in terms of the dual variables p, while retaining all the basic parameters that defined the primal program. It needs little appeal to one's intuition to interpret the dual variables p as some type of price vector. Unfortunately this, in itself, does not provide much insight, since the dual programs to various problems, for example linear programming and posynomial programming, generate dual variables with quite different interpretations.
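The closed forms just derived are easy to spot-check numerically. The sketch below uses invented data (α, p, e are arbitrary, with Σ_i α_i = 1) and verifies the budget condition ⟨p, x⟩ = 0, the first-order conditions α_i(x_i + e_i)^{−1} = λp_i, and the agreement of v(p; e) computed from the optimal x with the closed form above.

```python
import math

# Spot check (invented data) of the closed forms for the log-linear welfare:
#   x_i + e_i = alpha_i * (sum_j p_j e_j) / p_i,   lambda = 1 / sum_j p_j e_j,
#   v(p; e) = sum_i alpha_i * log(p_i / (alpha_i * sum_j p_j e_j)).

alpha = [0.5, 0.3, 0.2]      # normalized weights, sum = 1
p     = [2.0, 1.0, 4.0]      # arbitrary positive prices
e     = [1.0, 3.0, 0.5]      # endowments

value = sum(pi * ei for pi, ei in zip(p, e))   # value of the endowment (= 7)
lam   = 1.0 / value                            # marginal utility of endowment
x     = [ai * value / pi - ei for ai, pi, ei in zip(alpha, p, e)]

assert abs(sum(pi * xi for pi, xi in zip(p, x))) < 1e-9       # <p, x> = 0
for ai, pi, xi, ei in zip(alpha, p, x, e):
    assert abs(ai / (xi + ei) - lam * pi) < 1e-9              # first-order cond.

v_direct = -sum(ai * math.log(xi + ei) for ai, xi, ei in zip(alpha, x, e))
v_closed =  sum(ai * math.log(pi / (ai * value)) for ai, pi in zip(alpha, p))
assert abs(v_direct - v_closed) < 1e-9
```

Note that x is not required here to lie in the production cone χ; the check concerns only the utility transform itself, for a fixed price vector.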
For dual programs derived by use of the utility transform, the key lies in the requirement that at optimality ⟨p, x⟩ = 0. The optimizing process implicit in the utility transform is identical to the well-known optimizing problem in economics in which a consumer selects his most preferred commodity bundle, subject to the restriction that his expenditure must not exceed his income. Thus it seems plausible to interpret the dual variables p as market prices. A careful examination not only of the dual program, but also of the relationships between the primal and dual variables at optimality, indicates that this is a valid and useful interpretation.

As a consequence of the utility transform [v, V], the dual variables p must be non-negative, which is a requirement for a sensible price system. The dual variables p are related to the primal variables x by the relationship

    λp_i = ∂w(x̄; e)/∂x_i.

The Lagrange multiplier λ from the utility transform appears because, even if the optimal solution x̄ to the primal problem is known, the magnitudes of the resulting prices depend on the units of measurement used in w(x; e). This relationship indicates that in order to have an absolute measure of the dual variables p when the solution x̄ to the primal problem is known, it is necessary to know λ, which arises in the utility transform. By examination of the utility transform, it is clear that the Lagrange multiplier λ can be interpreted as the marginal utility of society's endowment, and its magnitude clearly depends on the units in which the social welfare w is expressed. In the planning problem considered here, it can easily be seen that λ = 1/Σ_i p_i e_i, which is the reciprocal of the value of society's endowment. Furthermore, as λ is neither a primal nor a dual variable, it is a linking variable.
As indicated earlier, consumer preferences for different commodity bundles can be expressed as an ordinal function, which need only be quasi-concave. If it were accepted that utility could be measured absolutely, and was not merely an ordering concept, then one could assume a concave utility function and use a duality theory based on the conjugate transform. Similar comments apply to the social welfare function w, which is also ordinal.

If commodity i = 1 is designated as the numeraire good, then a set of relative prices can be used to define the optimality condition

    p_i/p_1 = [∂w(x̄; e)/∂x_i] / [∂w(x̄; e)/∂x_1] = MRS_i1.

Thus at optimality the familiar result emerges, namely that the relative price of good i (in terms of good 1) is equal to the marginal rate of substitution of good i for good 1, MRS_i1. A close examination of the dual social welfare function v(p; e) indicates that it is similar to the indirect utility function referred to earlier. The difference here is that the dual social welfare function contains the term Σ_j p_j e_j, which is clearly the market value of society's endowment given a price vector p. This is similar to the notion of income, which is used in conventional consumer theory. As an aside, it should be noted that the indirect utility function as used by Lau [10] in consumer theory can be expressed in the form of v(p; e) if consumer income, Y, is assumed to be the only endowment, with a price of 1, and all other goods are purchased by the sale of the endowment (income).

Keeping this discussion in mind, and accepting the interpretation of the dual variables at optimality as market prices, it is now possible to provide an insight into another optimality condition: μx̄ ∈ ∂^loc v(p̄; e). It can readily be shown that μ = 1/Σ_i p_i e_i. This optimality condition yields a set of relationships similar to the demand equations found in consumer theory.
For the planning problem being analysed here, this condition tells us that if an optimal solution p̄ to the dual is known, then the quantity of consumer goods to be produced, x̄^c, and the amount of leisure foregone, x̄^l, to produce these goods are obtained from this differential.

The conditions of primal and dual feasibility, together with the requirement that the linking condition ⟨p, x⟩ = 0 hold at optimality, can be used to develop some insights into the production sector.

Primal feasibility: x ∈ U ∩ χ ⇒ x ≥ −e and x = Az, z ≥ 0.
Dual feasibility: p ∈ V ∩ Π ⇒ p ≥ 0 and A′p ≤ 0.

The linking condition ⟨p, x⟩ = 0 is interpreted as a trade-balance constraint which ensures that the value of the goods produced is equal to the value of the compensation paid for the inputs used to manufacture these goods. By direct substitution,

    ⟨p, x⟩ = ⟨p, Az⟩ = ⟨A′p, z⟩ = 0.

From the primal and dual feasibility conditions, z ≥ 0 and A′p ≤ 0, so if ⟨A′p, z⟩ = 0, then each term of this scalar product must be zero. This is only possible when, for each production activity j,

    if Σ_i a_ij p_i < 0, then z_j = 0, or
    if Σ_i a_ij p_i = 0, then z_j ≥ 0.

Thus by use of the optimality conditions derived in Section 3, the familiar complementary slackness conditions emerge. These can be given the usual economic interpretation. If a particular activity j is included in the optimal plan, then z_j > 0 and thus Σ_i a_ij p_i = 0, which means that there are no excess profits: the value of the labour (foregone leisure) used in this particular production activity is equal to the value of the output. On the other hand, if a particular activity is uneconomic, z_j = 0, then the value of the labour services used to operate the activity at unit level is greater than the returns from the products produced by the activity. Again, the results that emerge are the standard ones encountered in the economic theory of market behaviour.
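The complementary-slackness argument can be illustrated with a small invented example. In the sketch below the matrix A, the prices p, and the activity levels z are made up: activity 1 breaks even ((A′p)₁ = 0) and is operated, while activity 2 is strictly unprofitable ((A′p)₂ < 0) and is idle, so every term of ⟨A′p, z⟩ vanishes.

```python
# Invented data illustrating the complementary slackness derived above:
# with z >= 0 and A'p <= 0, the identity <p, Az> = <A'p, z> = 0 forces
# (A'p)_j * z_j = 0 for every activity j.

A = [[ 3.0, 1.0],     # consumer good 1 produced by each activity
     [ 0.0, 1.5],     # consumer good 2
     [-1.0, -1.0]]    # leisure enters negatively: labour is an input
p = [1.0, 1.0, 3.0]   # candidate prices (labour is the dear commodity)
z = [1.0, 0.0]        # activity 1 operated, activity 2 idle

Atp = [sum(A[i][j] * p[i] for i in range(3)) for j in range(2)]
assert all(c <= 0.0 for c in Atp)                 # dual feasibility: A'p <= 0

x = [sum(A[i][j] * z[j] for j in range(2)) for i in range(3)]   # x = Az
assert abs(sum(pi * xi for pi, xi in zip(p, x))) < 1e-9         # <p, x> = 0

for j in range(2):
    assert Atp[j] * z[j] == 0.0        # each term of <A'p, z> vanishes
    if Atp[j] < 0.0:
        assert z[j] == 0.0             # unprofitable activities are idle
```

Here (A′p) = (0, −0.5): activity 1 earns exactly the value of the labour it consumes, while activity 2 would lose 0.5 per unit level and is therefore not operated.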
Thus it can be seen that the planning problem formulated at the beginning of this section can be viewed in a completely different manner. An alternative to solving the original planning problem, which required the planners to issue the directives x̄ to everyone in society, would be to solve the dual program to obtain a set of prices p̄, which can be interpreted as market prices. The planners then need only announce these prices p̄, and the members of society, by responding to these price signals, will generate an allocation of resources equivalent to that which would have occurred under the directives x̄.

ACKNOWLEDGMENT

The authors wish to thank the referee for his helpful comments and for pointing out the important paper [7].

BIBLIOGRAPHY

[1] Arrow, K.J. and A.C. Enthoven, "Quasi-Concave Programming," Econometrica, 29, 779-800 (1961).
[2] Crouzeix, J.P., "Contributions à l'étude des fonctions quasi-convexes," Ph.D. Dissertation, University of Clermont, France (1977).
[3] Diewert, W.E., "Applications of Duality Theory," in Frontiers of Quantitative Economics, Vol. II, M.D. Intriligator and D.A. Kendrick, Editors (American Elsevier Publishing Co., New York, 1974).
[4] Green, H.A.J., Consumer Theory (Macmillan, London, 1976).
[5] Greenberg, H.J. and W.P. Pierskalla, "Surrogate Mathematical Programming," Operations Research, 18, 924-939 (1970).
[6] Greenberg, H.J. and W.P. Pierskalla, "A Review of Quasi-Concave Functions," Operations Research, 19, 1553-1570 (1971).
[7] Greenberg, H.J. and W.P. Pierskalla, "Quasi-Conjugate Functions and Surrogate Duality," Cahiers du Centre d'Études de Recherche Opérationnelle, 15, 437-448 (1973).
[8] Heal, G.M., The Theory of Economic Planning (American Elsevier, New York, 1973).
[9] Jefferson, T.R., G.M. Folie and C.H. Scott, "Dual Games," School of Mechanical and Industrial Engineering Report (1977).
[10] Lau, L.J., "Duality and the Structure of Utility Functions," Journal of Economic Theory, 1, 374-396 (1969).
[11] Luenberger, D.G., "Quasi-Convex Programming," SIAM Journal on Applied Mathematics, 16, 1090-1095 (1968).
[12] Peterson, E.L., "Geometric Programming," SIAM Review, 18, 1-52 (1976).
[13] Rockafellar, R.T., Convex Analysis (Princeton University Press, Princeton, New Jersey, 1970).
[14] Roy, R., "La distribution du revenu entre les divers biens," Econometrica, 15, 205-225 (1947).

APPENDIX

PROOF OF LEMMA 1: Consider any two points p_1 and p_2 and 0 < λ < 1. Then

    v(λp_1 + (1−λ)p_2) = inf {−u(x) | ⟨λp_1 + (1−λ)p_2, x⟩ ≤ 0}
        ≥ min [ inf {−u(x_1) | ⟨p_1, x_1⟩ ≤ 0}, inf {−u(x_2) | ⟨p_2, x_2⟩ ≤ 0} ]
        = min [v(p_1), v(p_2)].

This proves the quasi-concavity of v(p).

Consider λ > 0. Then

    v(λp) = inf_{x∈U} {−u(x) | ⟨λp, x⟩ ≤ 0} = inf_{x∈U} {−u(x) | ⟨p, x⟩ ≤ 0} = v(p).

Thus v(p) is positively homogeneous of degree zero. V is a convex cone by construction.

Let x ∈ ∂^loc v(γp), γ > 0. By definition

(3)    lim_{α→0+} [v(γp + αΔp) − v(γp)]/α ≤ ⟨Δp, x⟩.

Since v is positively homogeneous of degree zero, (3) implies

(4)    lim_{α→0+} [v(p + (α/γ)Δp) − v(p)]/α ≤ ⟨Δp, x⟩.

Substituting Δq = Δp/γ in (3) we obtain

(5)    lim_{α→0+} [v(p + αΔq) − v(p)]/α ≤ γ⟨Δq, x⟩.

Thus by (5) we have proven that x ∈ ∂^loc v(γp) if and only if γx ∈ ∂^loc v(p). Similar properties are proved for the surrogate dual in [5].

PROOF OF LEMMA 2: By definition, we have

    hypo v = {(p, β) | p ∈ V, β ≤ v(p)}.

Let {(p_i, β_i)} be a convergent sequence with lim (p_i, β_i) = (p̄, β̄) and (p_i, β_i) ∈ hypo v for all i. We require to show that (p̄, β̄) is a member of hypo v. Assume that (p̄, β̄) ∉ hypo v. There are two possibilities to consider: (i) p̄ ∉ V, and (ii) β̄ > v(p̄). In case (i), lim β_i = −∞ by definition. This contradicts the assumption that {(p_i, β_i)} is convergent. Hence p̄ ∈ V. In case (ii), we let (x_i, α_i) be such that v(p_i) = −u(x_i) = −α_i and β_i ≤ −α_i for all i.
This is admissible by the definition of V. We let lim (x_i, α_i) = (x̄, ᾱ), where {(x_i, α_i)} and (x̄, ᾱ) belong to hypo u, the latter since hypo u is closed. Hence ᾱ = u(x̄) and v(p̄) = −u(x̄). This in turn implies that β̄ ≤ v(p̄), which is a contradiction. Hence hypo v is closed. A similar lemma is proved in [7].

PROOF OF LEMMA 3: By the utility transform we have

(6)    v(p) = inf {−u(x) | ⟨p, x⟩ ≤ 0}.

The solution to (6) is the solution to the saddlepoint problem

    v(p) = inf_x sup_{λ≥0} {λ⟨p, x⟩ − u(x)}.

The first-order conditions require that λp ∈ ∂^loc u(x), λ ≥ 0. Consider now the transform of v(p),

    inf_p sup_{μ≥0} {μ⟨p, x⟩ − v(p)}.

The first-order conditions require μx ∈ ∂^loc v(p), μ ≥ 0.

PROOF OF LEMMA 4: Suppose there exists z ∈ U such that u(z) = sup_{x∈U} u(x) is defined, and

    v(p) = inf_{x∈U} {−u(x) | ⟨p, x⟩ ≤ 0}.

Thus v(p) = −u(z) for ⟨p, z⟩ ≤ 0. Consider p_1, p_2 ∈ V ∩ {p | ⟨p, z⟩ > 0} such that p_1 ≢ p_2 and p_1 ≢ −p_2. Choose 0 < λ < 1. Then

    v(λp_1 + (1−λ)p_2) = inf {−u(x) | ⟨λp_1 + (1−λ)p_2, x⟩ ≤ 0}
        > min [ inf {−u(x_1) | ⟨p_1, x_1⟩ ≤ 0}, inf {−u(x_2) | ⟨p_2, x_2⟩ ≤ 0} ]
        = min [v(p_1), v(p_2)]

by the strict quasi-concavity of u.

Suppose instead that sup u(x) is undefined. Consider p_1, p_2 ∈ V such that p_1 ≢ p_2, and choose 0 < λ < 1. Then

    v(λp_1 + (1−λ)p_2) > min [v(p_1), v(p_2)]

by the strict quasi-concavity of u.

PROOF OF THEOREM 1: First assume

(7)    u(x̄) + v(p̄) = 0.

By Lemma 3, (7) implies λp̄ ∈ ∂^loc u(x̄) and μx̄ ∈ ∂^loc v(p̄). λ and μ are positive because u(x̄) and v(p̄) are strictly quasi-concave.

Suppose p̄ satisfies ⟨p̄, z⟩ ≤ 0. Since x̄ ≠ z,

(8)    {p | ⟨p, x̄⟩ ≤ 0, ⟨p, z⟩ > 0} ≠ ∅,

and v(p) is strictly quasi-concave on this set. For p belonging to the set defined by (8), v(p) > v(p̄). This contradicts the utility inequality. Therefore (a) must hold.

Suppose ⟨p̄, x̄⟩ < 0. Then by the strict quasi-concavity of u(x) there exists an x such that ⟨p̄, x⟩ = 0 and u(x) > u(x̄). This contradicts the utility inequality. Therefore (c) must hold.
Suppose u(x̄) < sup {u(y) | y = x̄ or y = −x̄}. This too would contradict the utility inequality. Thus (d) must hold.

Now, going the other way, suppose we have λp̄ ∈ ∂^loc u(x̄). Consider

(9) v(p̄) = inf {−u(x) | ⟨p̄, x⟩ ≤ 0}.

By property (i), if the infimum in (9) exists it is attained on U ∩ {x | ⟨p̄, x⟩ ≤ 0}. Because u(x) is strictly quasi-concave, a local minimum is a global minimum. Therefore v(p̄) = −u(x̄).

Suppose now (IIa-d) hold. v(p) is strictly quasi-concave in the sense of Lemma 4 on S. Therefore (b) and Lemma 2 imply

−v(p̄) = inf {−v(p) | ⟨p, x̄⟩ ≤ 0}.

Let x̄ satisfy (d) in addition. Let λp̄ ∈ ∂^loc u(x̄), λ > 0; ∂^loc u(x̄) is non-empty because of property (i). Consider

(10) inf {−u(x) | ⟨p̄, x⟩ ≤ 0}.

The infimum of (10) is attained at x̄ by construction. Therefore v(p̄) = −u(x̄) and ⟨p̄, x̄⟩ ≤ 0. By construction, −v(p̄) < −v(p) if p ≠ p̄; otherwise we would have u(x̄) + v(p̄) > 0, which contradicts the utility inequality. Therefore u(x̄) + v(p̄) = 0.

Note that if x̄ = z, any p satisfying ⟨p, x̄⟩ ≤ 0 and p ∈ V will satisfy the utility inequality. λ and μ are then equal to zero, and we lose the relationships I and II between the primal and dual variables. The primal problem reduces to one of global minimization, which is relatively straightforward.

PROOF OF THEOREM 2: Assume x̄ is an optimum for Program A. Thus there exists a vector p̄ such that λp̄ ∈ ∂^loc u(x̄), λ > 0, and ⟨λp̄, Δx⟩ ≤ 0 for Δx ∈ χ. Since x̄ ∈ χ, ⟨λp̄, x̄⟩ ≤ 0. Thus p̄ ∈ Π ∩ V by construction and Theorem 1. Also by Theorem 1, u(x̄) + v(p̄) = 0, and the remaining conditions hold since ⟨p̄, Δx⟩ ≤ 0 for all Δx ∈ χ. Now suppose the optimum for Program B is p* ∈ V ∩ Π such that v(p*) > v(p̄). This would either contradict u(x̄) + v(p*) ≤ 0, which is impossible, or imply ⟨x̄, p*⟩ > 0. This too is impossible, since x̄ ∈ χ and p* ∈ Π. Therefore p̄ is optimal for Program B. Suppose now we have p̄, an optimum for Program B.
Thus there exists a vector x̄ such that μx̄ ∈ ∂^loc v(p̄), μ > 0, u(x̄) = sup {u(y) | y = x̄ or y = −x̄}, and ⟨x̄, Δp⟩ ≤ 0 for Δp ∈ Π. Since p̄ ∈ Π, ⟨x̄, p̄⟩ ≤ 0. By construction and Theorem 1, x̄ ∈ χ ∩ U. Also by Theorem 1, u(x̄) + v(p̄) = 0, and the remaining optimality conditions hold. Suppose the optimum for Program A is x* ∈ U ∩ χ such that u(x*) > u(x̄). This would either contradict u(x*) + v(p̄) ≤ 0, which is impossible, or imply ⟨x*, p̄⟩ > 0, which contradicts the feasibility of x* and p̄. The result is proved.

ON THE EXISTENCE OF JOINT PRODUCTION FUNCTIONS*

Rokaya Al-Ayat
Lawrence Livermore Laboratory
Livermore, California

Rolf Fare
Department of Economics
Southern Illinois University
Carbondale, Illinois

ABSTRACT

Within a general framework of production correspondences satisfying a set of weak axioms, necessary and sufficient conditions for the existence of a joint production function are given. Without enforcing the strong disposability of inputs or outputs, it is shown that a joint production function exists if and only if both input and output correspondences are strictly increasing along rays.

Joint production functions are frequently used in economics; however, it was not until Shephard [6] defined such a notion within the general framework of production correspondences that their meaning became clear. The question of the existence of these functions, dealt with in this paper, is yet to be settled. On this issue Shephard [8] wrote, "The joint production function is a tricky concept, seemingly simple but not shown to exist except under very restrictive conditions."

For a production technology with strongly disposable inputs and outputs, Bol and Moeschlin [2] showed that continuity of both the input and the output correspondences together with essentiality of all inputs is sufficient for the existence of a joint production function.
Later, Bol [1] showed that such a function also exists if the essentiality condition is replaced by strict increasingness of the output correspondence in all inputs.

It is to be recalled that an output correspondence x → P(x) ∈ 2^{R₊^m} is a mapping from input vectors x ∈ R₊^n into the subsets P(x) ∈ 2^{R₊^m} of all output vectors obtainable from x. Inversely to P(x), the input correspondence u → L(u) := {x | u ∈ P(x)} is the set of all input vectors x yielding at least the output vector u. In this paper the existence of a joint production function will be considered under the weak axioms as stated in [7]. Specifically, neither the strong disposability of inputs or outputs (i.e., x′ ≥ x ∈ L(u) ⟹ x′ ∈ L(u) and u′ ≤ u ∈ P(x) ⟹ u′ ∈ P(x), respectively) nor convexity of P(x) or L(u) is enforced. Having strong disposability of inputs means that if a subvector of inputs is kept constant while the remaining ones are increased, output will never decrease, implying there can be no congestion in the production system. In addition, strong disposability of outputs excludes their null-jointness (see [9]); i.e., each output must be producible when the others are not produced. Thus, having only weak disposability of inputs (i.e., P(λ·x) ⊇ P(x), λ ≥ 1) and of outputs (i.e., L(θ·u) ⊆ L(u), θ ≥ 1) allows modelling of both congestion and null-jointness.

*This research has been partially supported by the Office of Naval Research under Contract N00014-76-C-0134 with the University of California. Reproduction in whole or in part is permitted for any purpose of the United States Government.

As defined by Shephard [6], the joint production function relates input and output isoquants to each other. Recall that

ISOQ P(x) := {u | u ∈ P(x), θ·u ∉ P(x) for θ > 1}, P(x) ≠ {0},

and

ISOQ L(u) := {x | x ∈ L(u), λ·x ∉ L(u) for λ < 1}, L(u) ≠ ∅.
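The ray-based isoquant definitions above can be checked numerically for a simple scalar-output technology. The technology below (phi, its data points, and the sampled grids of θ and λ) is a hypothetical illustration, not taken from the paper: for P(x) = [0, φ(x)] with φ(x₁, x₂) = √(x₁x₂), the output isoquant at x is the single point φ(x), and x lies on ISOQ L(u) exactly when φ(x) = u, since φ is strictly increasing along rays.

```python
import math

def phi(x1, x2):
    # Hypothetical scalar-output technology: P(x) = [0, phi(x)], L(u) = {x : phi(x) >= u}.
    return math.sqrt(x1 * x2)

def in_isoq_P(u, x, eps=1e-9):
    """u in ISOQ P(x): u is obtainable from x, but theta*u is not, for theta > 1."""
    return 0 < u <= phi(*x) + eps and all(
        th * u > phi(*x) + eps for th in (1.01, 1.5, 2.0))  # sampled theta > 1

def in_isoq_L(x, u, eps=1e-9):
    """x in ISOQ L(u): x yields u, but lam*x does not, for lam < 1."""
    if phi(*x) + eps < u:
        return False  # x is not even in L(u)
    return all(phi(lam * x[0], lam * x[1]) + eps < u
               for lam in (0.99, 0.5, 0.1))  # sampled lambda < 1

x, u = (4.0, 1.0), 2.0           # phi(4, 1) = 2
print(in_isoq_P(u, x))           # True: u sits on the output isoquant of x
print(in_isoq_L(x, u))           # True: matches u in ISOQ P(x)
print(in_isoq_L((8.0, 2.0), u))  # False: 0.5*(8, 2) already yields u
```

Note that the θ and λ grids make this only a sampled check of the set definitions, not a proof; for this φ the equivalence `u ∈ ISOQ P(x) ⟺ x ∈ ISOQ L(u)` holds at every tested point, which is exactly the property the lemma below turns into a characterization.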
DEFINITION: The function F : R₊^m × R₊^n → R₊ such that

(1) for u⁰ ≥ 0, ISOQ L(u⁰) = {x | F(u⁰, x) = 0}, L(u⁰) ≠ ∅, and

(2) for x⁰ ≥ 0, ISOQ P(x⁰) = {u | F(u, x⁰) = 0}, P(x⁰) ≠ {0},

is a joint production function.

An equivalent statement of the definition, to be used in the sequel, was proved by Bol and Moeschlin [2], namely:

LEMMA: A joint production function F(u, x) exists if and only if, for all x ≥ 0 with P(x) ≠ {0} and all u ≥ 0 with L(u) ≠ ∅,

u ∈ ISOQ P(x) ⟺ x ∈ ISOQ L(u).

THEOREM: For all x ≥ 0, u ≥ 0 such that P(x) ≠ {0}, L(u) ≠ ∅, with x → P(x) (u → L(u)) satisfying the weak axioms, a necessary and sufficient condition for the existence of a joint production function F(u, x) is

(*) ISOQ P(x) ∩ ISOQ P(λ·x) = ISOQ L(u) ∩ ISOQ L(θ·u) = ∅ for all positive scalars λ, θ ≠ 1.

PROOF: To show the necessity of (*), assume there is a joint production function F(u, x) and let u ∈ ISOQ P(x) ∩ ISOQ P(λ·x). By the lemma, x ∈ ISOQ L(u) and λ·x ∈ ISOQ L(u), λ ≠ 1, which is a contradiction. Thus, if a joint production function exists, ISOQ P(x) ∩ ISOQ P(λ·x) is empty for all positive scalars λ, λ ≠ 1. A similar argument can be used to show that the existence of F(u, x) implies that, for all positive θ, θ ≠ 1, ISOQ L(u) ∩ ISOQ L(θ·u) is empty.

To show the sufficiency, assume that (*) holds and that, for x ≥ 0 with P(x) ≠ {0}, u ∈ ISOQ P(x) but x ∉ ISOQ L(u). From the definition of the isoquant, there exists a λ < 1 such that λ·x ∈ ISOQ L(u), implying that u ∈ P(λ·x). But from the weak disposability of inputs, P(λ·x) ⊆ P(x), which together with (*) implies that u ∉ ISOQ P(x), a contradiction. Similarly, it can be shown that having ISOQ L(u) ∩ ISOQ L(θ·u) empty guarantees that x ∈ ISOQ L(u) ⟹ u ∈ ISOQ P(x). Hence the sufficiency of (*) for the existence of a joint production function is proved (see the lemma). Q.E.D.

Continuity of the production correspondences has not been enforced.
However, following an argument similar to that used by Bol and Moeschlin in [2], one can prove:

COROLLARY: If a joint production function exists, then both the input and the output correspondences are continuous along rays, i.e.,

P(λ⁰·x) = ∪_{0<λ<λ⁰} P(λ·x) and L(θ⁰·u) = ∪_{θ>θ⁰} L(θ·u),

respectively, with u, x ≠ 0.

Note that continuity along rays together with strong disposability implies continuity (see [2] for the definition).

Next, consider the production technology

P(x₁, x₂) := {{(u₁, 0)} ∪ {(0, u₂)} | 0 ≤ uᵢ ≤ xᵢ, i = 1, 2}

and, inversely,

L(u₁, u₂) := {{(x₁, 0)} ∪ {(0, x₂)} | xᵢ ≥ uᵢ, i = 1, 2}.

The corresponding isoquants are given by

ISOQ L(u₁, u₂) = {{(x₁, 0)} ∪ {(0, x₂)} | xᵢ = uᵢ, i = 1, 2}

and

ISOQ P(x₁, x₂) = {{(u₁, 0)} ∪ {(0, u₂)} | uᵢ = xᵢ, i = 1, 2}.

In this example the production correspondence satisfies the weak axioms, but neither strong disposability of inputs and outputs nor the essentiality condition (i.e., P(x) ≠ {0} implies (x₁, x₂) > (0, 0)) used in [2] holds. Yet it is clear that a joint production function exists.

Finally, an example not satisfying the sufficiency conditions applied in [1] and [2] is given. Before introducing it, the following proposition, to be used below, is proved.

PROPOSITION: If the production function φ(x) := max {u | x ∈ L(u)} is continuous and strictly increasing along rays in the input space R₊^n, then ISOQ L(u) = {x | φ(x) = u}, u > 0.

PROOF: Clearly ISOQ L(u) ⊆ {x | φ(x) ≥ u}, u > 0. Let x⁰ ∈ {x | φ(x) > u}. Since φ is continuous along rays, {λ | φ(λ·x⁰) > u} is open, implying that x⁰ ∉ ISOQ L(u); hence ISOQ L(u) ⊆ {x | φ(x) = u}. Next, assume x⁰ ∉ ISOQ L(u), u > 0. Then, since φ is strictly increasing along rays, if x⁰ ∈ L(u) there is a λ < 1 such that φ(λ·x⁰) = u, implying that x⁰ ∉ {x | φ(x) = u}. Q.E.D.

Now consider the output correspondence x → P(x) ⊆ [0, +∞), P(x) := {u ∈ R₊ : u ≤ φ(x)}, where

φ(x) := A·[(1−δ)(x₁ − γx₂)^(−ρ) + δx₂^(−ρ)]^(−1/ρ) for (x₁ − γx₂) > 0, and φ(x) := 0 otherwise,

and where the parameters of the WDI production function φ(x) are A > 0, δ ∈ (0, 1), γ ∈ (0, ∞) and ρ ∈ (−1, 0) (see [3]). For these values of the parameters, φ(x) is upper semicontinuous, which is equivalent to P(x) being upper hemi-continuous (see [5], p. 22); also, x₂ = 0 does not imply P(x) = {0}, and φ is not increasing in x₂. Thus P(x) meets neither the continuity requirement of [1] and [2] nor the other sufficiency condition of [2] (essentiality of all factors) or of [1] (strict increasingness in all factors). Using the proposition above, the isoquants of P(x) and L(u) are easily computed to be

ISOQ P(x) = {u | u = φ(x)} and ISOQ L(u) = {x | φ(x) = u}.

Thus x ∈ ISOQ L(u) ⟺ u ∈ ISOQ P(x), showing that, under the weak axioms for a production technology, the sufficient conditions found in [1] and [2] need not hold for a joint production function to exist.

ACKNOWLEDGMENT

The authors sincerely thank Professor Ronald W. Shephard for his suggestions and helpful comments.

BIBLIOGRAPHY

[1] Bol, G., "Produktionskorrespondenzen und Existenz skalarwertiger Produktionsfunktionen bei der Mehrgüterproduktion," Karlsruhe (1976).
[2] Bol, G. and O. Moeschlin, "Isoquants of Continuous Production Correspondences," Naval Research Logistics Quarterly, Vol. 22, pp. 391-398 (1975).
[3] Fare, R. and L. Jansson, "On VES and WDI Production Functions," International Economic Review, Vol. 16, pp. 745-750 (1975).
[4] Fare, R. and L. Jansson, "Joint Inputs and the Law of Diminishing Returns," Zeitschrift für Nationalökonomie, Vol. 36, pp. 407-416 (1976).
[5] Hildenbrand, W., Core and Equilibria of a Large Economy (Princeton University Press, 1974).
[6] Shephard, R.W., Theory of Cost and Production Functions (Princeton University Press, 1970).
[7] Shephard, R.W., "Semi-Homogeneous Production Functions," Lecture Notes in Economics and Mathematical Systems, Volume 99: Production Theory (Springer-Verlag, Berlin, 1974).
[8] Shephard, R.W., "On Household Production Theory," ORC 76-24, Operations Research Center, University of California, Berkeley (1976).
[9] Shephard, R.W. and R. Fare, "The Law of Diminishing Returns," Zeitschrift für Nationalökonomie, Vol. 34, pp. 69-90 (1974).

THE PURE FIXED CHARGE TRANSPORTATION PROBLEM

John Fisk
School of Business
State University of New York at Albany
Albany, New York

Patrick McKeown
College of Business Administration
University of Georgia
Athens, Georgia

ABSTRACT

The pure fixed charge transportation problem (PFCTP) is a variation of the fixed charge transportation problem (FCTP) in which only fixed costs are incurred when a route is opened. We present in this paper a direct search procedure using the LIFO decision rule for branching. The procedure is enhanced by the use of 0-1 knapsack problems which determine bounds on partial solutions. Computational results are presented and discussed.

INTRODUCTION

The pure fixed charge transportation problem (PFCTP) deals with the optimal allocation of the supply Sᵢ available at source i = 1, 2, ..., m in order to meet the demand Dⱼ at destination j = 1, 2, ..., n. Before any goods can be shipped from i to j, a fixed charge fᵢⱼ must be paid. The objective is then to minimize the total cost of shipping available goods to meet the required demands.

One would wish to solve the pure fixed charge transportation problem in those situations where the cost to transport goods over an arc must be paid as a lump sum rather than as per-unit costs. An example would be leasing trucks to move goods between supply points and demand points.
So long as the amount demanded is less than the capacity of the truck for that route, the cost of moving any amount of goods greater than zero is approximately the same; i.e., the leasing cost, fuel, and the driver's salary are fixed. In this case, we would wish to determine the set of routes which would allow us to satisfy total demand at the minimum possible sum of these lump, or fixed, costs. Mathematically, the problem may be formulated as follows:

(1) Minimize Z = Σᵢ₌₁^m Σⱼ₌₁^n fᵢⱼyᵢⱼ

subject to

(2) Σⱼ₌₁^n xᵢⱼ = Sᵢ, i = 1, 2, ..., m,

(3) Σᵢ₌₁^m xᵢⱼ = Dⱼ, j = 1, 2, ..., n,

(4) xᵢⱼ ≥ 0, ∀ i, j,

(5) yᵢⱼ = 1 if xᵢⱼ > 0, yᵢⱼ = 0 otherwise.

We are assuming that Σᵢ₌₁^m Sᵢ = Σⱼ₌₁^n Dⱼ and that the fᵢⱼ's are integer.

In this paper we will present a direct search procedure for solving the PFCTP which utilizes the unique structure of the constraints (2) and (3) to derive bounds that lead to an efficient search over the possible solutions. In Section II we will discuss solution procedures for a closely related problem, the fixed charge transportation problem, as they relate to the PFCTP, and suggest a procedure for solving the PFCTP. Section III discusses the development of bounds and presents the iterative procedure. Computational results are presented and discussed in Section IV.

II. SOLUTION PROCEDURES FOR THE FIXED CHARGE TRANSPORTATION PROBLEM (FCTP)

Discussion of the PFCTP in the literature is limited. Numerous techniques for solving the closely related fixed charge transportation problem (FCTP) are presently available, however. The FCTP can be specified mathematically as follows:

(6) Minimize Z = Σᵢ₌₁^m Σⱼ₌₁^n (cᵢⱼxᵢⱼ + fᵢⱼyᵢⱼ)

subject to (2), (3), (4), and (5). We also assume that Σᵢ Sᵢ = Σⱼ Dⱼ and that the fixed costs fᵢⱼ are integer. In this case we also have the usual per-unit transportation costs cᵢⱼ.

Research into solving the FCTP may be classified as either heuristic or exact (algorithmic). We will be interested in the latter.
Some of these are Murty [15], Gray [6], Kennington and Unger [9, 10], McKeown [14], Kennington [11], Barr [2], Frank [4], and Steinberg [17]. With the exception of extreme point ranking procedures such as those presented by Murty and by McKeown, any of the above procedures could be applied to solving the PFCTP. Most of these would not be expected to be efficient, either because they are designed to solve more general types of fixed charge problems and do not take full advantage of the special constraints (2) and (3), or because they have been shown to be efficient only when variable costs dominate fixed costs. Examples of heuristic methods are [1] and [13].

Two procedures which would appear to be useful in solving the PFCTP, however, are those of Gray and Kennington. In Gray's procedure, a series of tests is developed which enables him to decrease the number of vertices for which he must find the corresponding feasible solution in order to find a satisfactory assignment of routes given specific values of the logic variables yᵢⱼ. Kennington introduces a branch-and-bound procedure for solving the FCTP which employs a relaxation corresponding to the following Hitchcock transportation problem (TP):

(7) Minimize Z = Σᵢ₌₁^m Σⱼ₌₁^n [(fᵢⱼ/uᵢⱼ) + cᵢⱼ]xᵢⱼ

subject to (2), (3), and (4), where uᵢⱼ = min(Sᵢ, Dⱼ). The formulation for TP above was introduced by Balinski [1] in an approximate procedure for solving the FCTP. Solving TP at each node of his branch-and-bound tree, Kennington is able to calculate effective bounds and to determine simple penalties and feasibilities useful in directing the search procedure. His methodology could be applied to solving the PFCTP by simply setting cᵢⱼ = 0 for all i, j in (7).

III.
DIRECT SEARCH PROCEDURE FOR SOLVING THE PFCTP

Using the terminology of Geoffrion and Marsten [5], we solve the problem PFCTP by separating its set of feasible solutions into subproblems called candidate problems (CP) by assigning values to a subset of the variables (yᵢⱼ). A particular (CP) is fully defined by specifying the elements contained in each of two sets, J₀ and J₁, which represent the sets of transportation routes assigned "closed" (i.e., yᵢⱼ = 0) and "open" (i.e., yᵢⱼ = 1), respectively. The remaining elements reside in the set J₂ and are referred to as "free." The sets J₀, J₁, and J₂ are mutually exclusive and collectively exhaustive.

Our enumerative scheme is similar in most respects to a more traditional branch-and-bound scheme employing a direct search (single branch) strategy. In constructing our branching tree, we proceed through successive levels of the tree by choosing a route from the set J₂ and assigning it to J₁. We assign this route to J₀ in the branching tree only upon backtracking. A strict LIFO (last-in, first-out) backtracking rule is observed.

Three factors critical to the computational efficiency of the above approach are (1) the effort required to solve for the bound associated with a given candidate problem, and the quality of the bound produced, (2) the specification of rules useful in identifying (CP)'s which cannot be optimal, and (3) the choice of a separation variable from among those in J₂. Sections IIIa and IIIb detail two bounding procedures, the row feasibility test and the column feasibility test. These tests are easily applied and, when used in conjunction with one another, can yield efficient bounds. Section IIIc specifies a test attributable to Hirsch and Dantzig [8] which serves to limit the number of routes assignable to set J₁. Section IIId outlines the rules used for selecting a separation variable from J₂.
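The open/closed/free bookkeeping with LIFO backtracking described above can be sketched as a small recursive search. This is an illustrative skeleton only, not PURFIX: the feasibility condition (opened capacity covering a demand) and the bound (the cost already committed) are simple stand-ins for the paper's row and column tests, and all names are invented. Conveniently, running it on the row-1 data of the paper's Appendix reproduces the knapsack value 23 computed there.

```python
def lifo_direct_search(fixed_costs, capacities, demand):
    """Toy LIFO direct search: open a subset of routes whose capacity
    covers `demand` at minimum total fixed cost.

    Mirrors the scheme in the text: at each level one free route (J2) is
    assigned "open" (J1); upon backtracking it is flipped to "closed" (J0);
    a partial solution is fathomed when its bound reaches the incumbent."""
    n = len(fixed_costs)
    best = float("inf")

    def dfs(level, open_cost, open_cap):
        nonlocal best
        if open_cost >= best:        # fathom: committed cost is a valid lower bound
            return
        if open_cap >= demand:       # feasible completion: opening more only adds cost
            best = open_cost
            return
        if level == n:               # all routes assigned, demand uncovered
            return
        # branch: open route `level` (assign to J1) ...
        dfs(level + 1, open_cost + fixed_costs[level], open_cap + capacities[level])
        # ... then, backtracking LIFO, close it (move to J0)
        dfs(level + 1, open_cost, open_cap)

    dfs(0, 0, 0)
    return best

# Row 1 of the Appendix example: costs (0, 58, 23), demands (92, 64, 21), supply 113.
print(lifo_direct_search([0, 58, 23], [92, 64, 21], 113))  # -> 23
```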
In the discussion that follows, we introduce the set J_S = J₀ ∪ J₁, which we refer to as a partial solution to the PFCTP.

IIIa. Row Feasibility Test

We define the cost for row feasibility for row i, RFᵢ, in terms of the least-cost set of demanders necessary to absorb the supply Sᵢ given the partial solution J_S. We further define ADᵢ as the total demand assigned to row i given the partial solution J_S, where

(8) ADᵢ = Σⱼ Dⱼyᵢⱼ

and all free variables in row i are assumed closed. If ADᵢ ≥ Sᵢ, then RFᵢ is simply the total cost of the open routes in row i, i.e.,

(9) RFᵢ = Σⱼ fᵢⱼyᵢⱼ.

If ADᵢ < Sᵢ, however, we must determine the minimum additional cost which must be incurred in order to satisfy a necessary condition for row feasibility. To do so, we solve the following 0-1 knapsack problem relative to the set of unassigned routes from supply i:

(10) Minimize Πᵢ = Σ_{j ∈ J₂(i)} fᵢⱼuᵢⱼ

(11) subject to Σ_{j ∈ J₂(i)} Dⱼuᵢⱼ ≥ dᵢ

(12) uᵢⱼ = 0, 1 ∀ j ∈ J₂(i),

where dᵢ = Sᵢ − ADᵢ and J₂(i) = the set of unassigned routes from supply i. Then, given that assigned demand is less than the available supply for row i, the minimum cost necessary to obtain row feasibility becomes

(13) RFᵢ = Σⱼ fᵢⱼyᵢⱼ + Πᵢ.

The applicability of the knapsack relation for determining RFᵢ is based upon the ability to solve problems such as (10)-(12) with minimal effort [3], [7], [18]. Such relations can yield efficient bounds and have been successfully applied in solving the generalized assignment problem [16] and in solving warehouse location problems [12].

Since the row feasibility test described above can be applied for each row (supplier) i given the partial solution J_S, the value RF = Σᵢ RFᵢ becomes a lower bound on the sum of fixed charges required for a feasible completion of J_S. If this value of RF is greater than or equal to Z̄ (the value of the current best known feasible solution), we have fathomed J_S.
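The bounding subproblem (10)-(12) is a 0-1 "covering" knapsack: choose free routes whose demands absorb the residual supply dᵢ at minimum fixed cost. A minimal brute-force solver (adequate at these sizes; the paper relies on the specialized algorithms of [3], [7], [18]) reproduces the row-1 computation worked out in the Appendix:

```python
from itertools import combinations

def min_cost_cover(costs, demands, residual):
    """Solve (10)-(12) by brute force: pick the subset of free routes whose
    total demand is at least `residual`, minimizing total fixed cost."""
    n = len(costs)
    best = float("inf")
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if sum(demands[j] for j in subset) >= residual:
                best = min(best, sum(costs[j] for j in subset))
    return best  # float('inf') signals that row feasibility is unattainable

# Row 1 of the Appendix example: f = (0, 58, 23), D = (92, 64, 21), S1 = 113.
print(min_cost_cover([0, 58, 23], [92, 64, 21], 113))  # -> 23 (open routes (1,1) and (1,3))
```

Applying the same routine to rows 2 and 3 of that example gives 29 and 12, so RF = 23 + 29 + 12 = 64, matching the Appendix.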
As pointed out previously, the knapsack relation which we employ for determining the bound RF requires relatively little computational effort. Even so, the computational efficiency of our procedure increases as the number of such knapsack problems necessary to solve the PFCTP decreases. The paragraph that follows indicates the procedures we employ in order to reduce the computational cost of using our knapsack relation.

At the initialization of our procedure, when all routes are considered free, we calculate RF by applying (10)-(12) for each row i as previously described, then store the knapsack solution for each such row. Thereafter, as we assign a route to be open or closed, we update the bound RF by adjusting the knapsack solution and its objective value for the corresponding row only. Also, upon assigning a route to be open (yᵢⱼ = 1), the knapsack relation need be applied only if:

(1) ADᵢ < Sᵢ, and
(2) the route (i, j) is not one of the assigned open routes in the stored knapsack solution for its row i.

Similarly, upon assigning a route to be closed (yᵢⱼ = 0), the knapsack relation need be applied only if:

(1) ADᵢ < Sᵢ, and
(2) the route (i, j) is not one of the assigned closed routes in the stored knapsack solution for its row i.

IIIb. Column Feasibility Test

In determining column feasibility for column j, CFⱼ, we use procedures strictly analogous to those described for determining the cost for row feasibility. We define ASⱼ as the total supply assigned to column j given the partial solution J_S, where

(14) ASⱼ = Σᵢ Sᵢyᵢⱼ

and all free variables in column j are assumed closed. If ASⱼ ≥ Dⱼ, then CFⱼ is simply the total cost of the open routes in column j; i.e.,

(15) CFⱼ = Σᵢ fᵢⱼyᵢⱼ.

If ASⱼ < Dⱼ, however, we must determine the minimum additional cost which must be incurred in order to satisfy a necessary condition for column feasibility.
To do so, we solve the following 0-1 knapsack problem relative to the free variables in column j:

(16) Minimize Πⱼ = Σ_{i ∈ J₂(j)} fᵢⱼvᵢⱼ

(17) subject to Σ_{i ∈ J₂(j)} Sᵢvᵢⱼ ≥ dⱼ

(18) vᵢⱼ = 0, 1 ∀ i ∈ J₂(j),

where dⱼ = Dⱼ − ASⱼ and J₂(j) = the set of unassigned routes to demander j. Given that assigned supply is less than the necessary demand for column j, the minimum cost necessary to obtain column feasibility becomes

(19) CFⱼ = Σᵢ fᵢⱼyᵢⱼ + Πⱼ.

The rules for applying the knapsack relation (16)-(18) follow closely those defined for rows in the preceding subsection. Also, since the column feasibility test described above can be applied for each column (demander) j given the partial solution J_S, the value CF = Σⱼ CFⱼ becomes a lower bound on the sum of fixed charges required for a feasible completion of J_S. The best available bound assignable to the partial solution J_S becomes max(RF, CF).

IIIc. Basis Constraint Test

Hirsch and Dantzig [8] showed that, for any fixed charge problem, an optimal solution occurs at an extreme point of the (continuous) constraint set. This implies that the xᵢⱼ values corresponding to a partial solution of the yᵢⱼ's must be linearly independent and must not be infeasible. Any partial solution which does not satisfy these conditions may be terminated. Also, the maximum number of nonzero elements (i.e., routes (i, j) for which yᵢⱼ = 1) in a basic solution is m + n − 1.

IIId. Choice of Separation Variable

The separation variable y_{i*j*} is chosen from amongst the sets of variables u* and v*. We define u* as the optimal set of open variables obtained by solving (10)-(12) for each row i for which ADᵢ < Sᵢ given J_S, and v* as the optimal set of open variables obtained by solving (16)-(18) for each column j for which ASⱼ < Dⱼ given J_S. If Πᵢ(ⱼ) represents the objective value of (10)-(12) given the closure of route (i, j) in row i, then pᵢⱼ = Πᵢ(ⱼ) − Πᵢ is the penalty associated with the closure of route (i, j).
Similarly, Πⱼ(ᵢ) represents the objective value of (16)-(18) given the closure of route (i, j) in column j, and qᵢⱼ = Πⱼ(ᵢ) − Πⱼ is the penalty associated with the closure of route (i, j). A nonzero penalty need be obtained only for those routes included in u* and v*. Given the determination of the set of penalties pᵢⱼ associated with the closure of each route in u*, the maximum increase in the value of row feasibility RF given the closure of any route in u* becomes p_M = max_{(i,j) ∈ u*} pᵢⱼ. Similarly, the maximum increase in the value of column feasibility CF given the closure of any route in v* becomes q_M = max_{(i,j) ∈ v*} qᵢⱼ. The separation variable y_{i*j*} is therefore that currently unassigned variable whose closure would yield the greatest bound associated with J_S:

(20) y_{i*j*} is the variable attaining max(RF + p_M, CF + q_M).

In the event that u* and v* are empty sets (i.e., ADᵢ ≥ Sᵢ ∀ i and ASⱼ ≥ Dⱼ ∀ j), the separation variable y_{i*j*} becomes the currently unassigned variable having minimum fixed cost.

This completes the discussion of the iterative procedure used and of the set of tests employed in order to eliminate partial solutions. For a simple example which illustrates the application of these tests within the procedure, the reader is referred to the Appendix.

IV. COMPUTATIONAL EXPERIENCE

The algorithm as described here, PURFIX, has been programmed in FORTRAN IV and run on the CYBER 70/74 using the time-share mode. A series of 5 × 5 problems similar to those originally tested by Kennington [11] was run. These problems had uniformly generated supplies and demands over the range 1-999, with uniformly generated costs using various ranges. The cost parameters and test results are shown in Table 1 below. In addition, the Kennington code was obtained for use as a benchmark to determine the relative efficiency of our algorithm. These results are also shown in Table 1. All solution times are an average of five problems.
Also shown for both procedures is the difference between the fastest and the slowest solution times for each set of problems (the range).

TABLE 1

Problem   Fixed Cost        PURFIX            Kennington
Set       Range             Average   Range   Average   Range
1         257-457           5.964     12.936  10.897    39.859
2         614-814           12.884    21.604  32.285    122.542
3         1328-1528         8.069     15.401  33.665    81.796
4         1231-3231         2.784     2.577   4.626     5.507
5         3463-5463         5.481     6.212   8.017     12.368
6         34700-36700       13.239    26.422  7.645     17.143
7         66400-76400       8.042     15.327  34.021    81.823
8         2570000-4570000   4.427     9.393   11.532    42.555

All times are in CPU seconds and do not include problem generation.

As may be seen in Table 1, PURFIX is faster than the Kennington code in all cases except one. That case happens to be the one where the fixed cost range is fairly small compared to the magnitude of the fixed costs; under these conditions PURFIX would be expected to have difficulty distinguishing the optimal solution. In six of the remaining seven cases, PURFIX is at least twice as fast. It can also be noted that for both procedures the range values are fairly large. This implies that the effectiveness of either procedure for pure fixed charge transportation problems is highly dependent upon the particular problem being solved and can vary greatly from problem to problem. As with the solution times, the range values are smaller for PURFIX in seven of the eight cases.

To test the effectiveness of the PURFIX procedure relative to problem size, we ran six sets of three problems each. All sets were similar except for problem size. The fixed charges were randomly generated with values between 0 and 10, and demands were generated with values between 10 and 100 in increments of 10. The supplies were generated in a similar manner, in such a way that total supply equals total demand. For each problem set there were five supplies but differing numbers of demands.
The results from the computational testing are shown in Table 2, with average times (CPU seconds) and ranges shown for each problem set.

TABLE 2

Problem Set   Size (m × n)   Average Solution Time   Range
1             5 × 5          .196                    .248
2             5 × 7          .467                    1.171
3             5 × 9          2.250                   4.045
4             5 × 10         .377                    .362
5             5 × 13         2.451                   2.963
6             5 × 15         ***                     ***

In Table 2 we see that while the number of arcs has a definite effect on solution time, it is not always the only determinant of difficulty of solution. This is evident from the fact that problem set four, with 50 arcs, was solved in less time than that required for problem sets two and three, each having fewer arcs. PURFIX was unable to solve any problems having 75 or more arcs in less than an average of 50 seconds.

Another factor that could affect ease of solution is the shape of the problem. By this is meant the relationship between the number of supplies and the number of demands. The problems tested in Table 2, with the exception of Set 1, were all rectangular problems with more demands than supplies. In Table 3 we have also tested "square" problems, i.e., those with equal numbers of supplies and demands. All characteristics other than shape were the same as for Table 2.

TABLE 3

Problem Set   Size (m × n)   Average Solution Time   Range
A             6 × 6          .597                    1.230
B             7 × 7          2.451                   3.489
C             8 × 8          1.873                   1.895

If we compare the problem sets in Table 3 to the problem sets in Table 2 having approximately the same number of arcs, i.e., problem sets 2, 4, and 5, we can get some idea of the effect of shape on ease of solution. However, the comparisons do not show any clear difference in solution times that could be attributed to the shape of the problem.

In summary, these computational results imply that while the size of the problem has a definite effect, ease of solution appears to be highly dependent on some combination of the costs, supplies, and demands. The exact effect is unclear but definitely deserves further research.
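As an end-to-end sanity check, tiny PFCTP instances can be solved exhaustively: enumerate every 0-1 pattern of the yᵢⱼ, test whether a flow meeting all supplies and demands exists on the open routes (a max-flow computation), and keep the cheapest feasible pattern. The sketch below is a cross-check, not the PURFIX procedure; run on the 3 × 3 instance from the Appendix that follows the references, it reproduces the optimal value 122 reported there.

```python
from collections import deque
from itertools import product

def feasible(open_routes, S, D):
    """True iff a transportation flow meeting every supply and demand exactly
    exists using only `open_routes` (max-flow equals total supply)."""
    m, n = len(S), len(D)
    N = m + n + 2  # node 0 = source, 1..m = rows, m+1..m+n = cols, N-1 = sink
    cap = [[0] * N for _ in range(N)]
    for i in range(m):
        cap[0][1 + i] = S[i]
    for j in range(n):
        cap[1 + m + j][N - 1] = D[j]
    for i, j in open_routes:
        cap[1 + i][1 + m + j] = min(S[i], D[j])
    flow = 0
    while True:  # Edmonds-Karp: BFS for an augmenting path in the residual graph
        parent = [-1] * N
        parent[0] = 0
        q = deque([0])
        while q and parent[N - 1] == -1:
            u = q.popleft()
            for v in range(N):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[N - 1] == -1:
            return flow == sum(S)  # feasible iff everything ships
        v, aug = N - 1, float("inf")
        while v != 0:  # bottleneck along the path found
            aug = min(aug, cap[parent[v]][v])
            v = parent[v]
        v = N - 1
        while v != 0:  # augment along the path
            cap[parent[v]][v] -= aug
            cap[v][parent[v]] += aug
            v = parent[v]
        flow += aug

def solve_pfctp(f, S, D):
    """Exhaustive PFCTP: try every open/closed pattern y and keep the cheapest
    feasible one. Only sensible for very small instances."""
    m, n = len(S), len(D)
    best = float("inf")
    for pattern in product((0, 1), repeat=m * n):
        open_routes = [(i, j) for i in range(m) for j in range(n)
                       if pattern[i * n + j]]
        cost = sum(f[i][j] for i, j in open_routes)
        if cost < best and feasible(open_routes, S, D):
            best = cost
    return best

# The Appendix instance: supplies (113, 45, 19), demands (92, 64, 21).
f = [[0, 58, 23], [54, 29, 59], [12, 70, 69]]
print(solve_pfctp(f, [113, 45, 19], [92, 64, 21]))  # -> 122
```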
REFERENCES

[1] Balinski, M.L., "Fixed Cost Transportation Problem," Naval Research Logistics Quarterly, Vol. 8, pp. 41-54 (1961).
[2] Barr, R.L., "The Fixed Charge Transportation Problem," presented at the Joint National Meeting of ORSA and TIMS in Puerto Rico (1975).
[3] Fisk, J., "An Initial Bounding Procedure for Use with 0-1 Single Knapsack Algorithms," Opsearch, Vol. 14, pp. 88-98 (1977).
[4] Frank, R., "On the Fixed Charge Hitchcock Transportation Problems," (dissertation), Johns Hopkins (1972).
[5] Geoffrion, A.M. and R.E. Marsten, "Integer Programming Algorithms: A Framework and State-of-the-Art Survey," in Perspectives on Optimization, A.M. Geoffrion, Ed., Addison-Wesley (1972).
[6] Gray, P., "Exact Solution of the Fixed-Charge Transportation Problem," Operations Research, Vol. 19, pp. 1529-1538 (1971).
[7] Greenberg, H. and R. Hegerich, "A Branch Search Algorithm for the Knapsack Problem," Management Science, Vol. 16, pp. 327-332 (1970).
[8] Hirsch, W.M. and G.B. Dantzig, "The Fixed Charge Problem," Naval Research Logistics Quarterly, Vol. 15, pp. 413-424 (1968).
[9] Kennington, J. and V. Unger, "The Group-Theoretic Structure in the Fixed-Charge Transportation Problem," Operations Research, Vol. 21, pp. 1142-1153 (1973).
[10] Kennington, J.L. and V.E. Unger, "A New Branch-and-Bound Algorithm for the Fixed-Charge Transportation Problem," Management Science, Vol. 22, pp. 1116-1126 (1976).
[11] Kennington, J.L., "The Fixed-Charge Transportation Problem: A Computational Study with a Branch-and-Bound Code," AIIE Transactions, Vol. 7, pp. 241-247 (1975).
[12] Khumawala, B.M. and U. Akinc, "An Efficient Branch and Bound Algorithm for the Capacitated Warehouse Location Problem," Management Science, Vol. 23, pp. 585-594 (1977).
[13] Kuhn, H.W. and W.J. Baumol, "An Approximative Algorithm for the Fixed-Charges Transportation Problem," Naval Research Logistics Quarterly, Vol. 9, pp. 1-16 (1962).
PURE FIXED CHARGE TRANSPORTATION PROBLEM 639

APPENDIX

For purposes of illustration, consider the following simple example problem (fixed charges c_ij, with supplies at the right and demands along the bottom):

         0   58   23  |  113
        54   29   59  |   45
        12   70   69  |   19
       ------------------
        92   64   21

Calculation of row feasibility for the first row, RF₁, requires solution of the following single knapsack problem:

    Minimize    Π₁ = 0u₁₁ + 58u₁₂ + 23u₁₃
    Subject to  92u₁₁ + 64u₁₂ + 21u₁₃ ≥ 113
                u₁ⱼ = 0, 1, ∀j.

The optimal solution to the above problem is u* = (u₁₁ = 1, u₁₃ = 1, u₁₂ = 0) and Π₁ = 23. Additional knapsack solutions can be obtained for rows 2 and 3 so that the following row feasibility table can be constructed:

    i    Π_i    open routes / closing penalties
    1    23     (1, 1) open / ∞ ;  (1, 3) open / 35
    2    29     (2, 2) open / 25
    3    12     (3, 1) open / 57

Each entry in the table above can be interpreted as follows: "open" indicates that the corresponding transportation route is assigned to be open in the optimal knapsack solution, while the value following it is the penalty associated with closing that route. Routes that do not appear are "closed" in the knapsack solution for their row. Row feasibility is RF = Σ Π_i = 23 + 29 + 12 = 64. A table similar to that for rows can be constructed for columns as follows:

    j    Ω_j    open route / closing penalty
    1     0     (1, 1) open / ∞
    2    58     (1, 2) open / 41
    3    23     (1, 3) open / 36

Column feasibility, CF = Σ Ω_j = 0 + 58 + 23 = 81.
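The knapsack computations behind the two feasibility tables can be reproduced by brute-force enumeration; the sketch below (the function name and layout are ours, not part of PURFIX) recovers the penalties Π_i and Ω_j and the totals RF = 64 and CF = 81:

```python
from itertools import product

# Example data from the appendix: fixed charges, supplies, demands.
cost = [[0, 58, 23],
        [54, 29, 59],
        [12, 70, 69]]
supply = [113, 45, 19]
demand = [92, 64, 21]

def min_cover_cost(costs, caps, need):
    """Minimum total fixed charge of a set of routes whose combined
    capacity covers `need` (the single knapsack of the appendix).
    Returns (cost, chosen index set); cost is inf if infeasible."""
    best, best_set = float("inf"), None
    for u in product([0, 1], repeat=len(costs)):
        if sum(cap * ui for cap, ui in zip(caps, u)) >= need:
            val = sum(c * ui for c, ui in zip(costs, u))
            if val < best:
                best, best_set = val, {i for i, ui in enumerate(u) if ui}
    return best, best_set

# Row feasibility: for row i, the open routes must carry the whole supply.
row_pen = [min_cover_cost(cost[i], demand, supply[i])[0] for i in range(3)]
# Column feasibility: for column j, the open routes must meet the demand.
col_pen = [min_cover_cost([cost[i][j] for i in range(3)], supply, demand[j])[0]
           for j in range(3)]
print(row_pen, sum(row_pen))   # the Pi_i values and RF
print(col_pen, sum(col_pen))   # the Omega_j values and CF
```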
Since an infinite penalty is associated with the closure of route (1, 1), y₁₁ is set to 1 and assigned to Ĵ. Since column feasibility is now obtained for column 1, the next variable is chosen from rows 1, 2, and 3 and columns 2 and 3. Since p̂ = p₃₁ = 57 and q̂ = q₁₂ = 41, the next variable assigned to Ĵ is that currently unassigned variable y_i*j* for which r_i*j* = max (64 + 57, 81 + 41) = 122, namely y₁₂. Assigning y₁₂ to one and adding it to Ĵ simultaneously satisfies row feasibility in row 1 and column feasibility in column 2. Row feasibility is increased from 64 to 99, since route (1, 2) was not in the optimal knapsack solution for calculating RF₁. The procedure continues in a similar manner. The solution tree for our example problem is found in Figure 1.

[Figure 1. Branching tree for example problem. Each node carries a pair [RF, CF] of row and column feasibility values and each branch is labeled with the route assigned, beginning at [64, 81] and ending at the optimal node [122, 122]; the node values [64, 122], [99, 81], [156, 81], [99, 93], [122, 93] and the routes (1, 1), (1, 2), (3, 1), (1, 3), (2, 2) appear in the tree.]

Note that the route assigned at each level of the tree is in parentheses, while the values for row and column feasibility are in brackets. The unit flows associated with the solution obtained in Figure 1 are as follows:

        73   19   21  |  113
             45       |   45
        19            |   19
       ------------------
        92   64   21

Optimal solution value is 122.

HEURISTIC ROUTINE FOR SOLVING LARGE LOADING PROBLEMS

John C. Fisk
State University of New York at Albany
Albany, New York

Ming S. Hung
Cleveland State University
Cleveland, Ohio

ABSTRACT

The loading problem involves the optimal allocation of n objects, each having a specified weight and value, to m boxes, each of specified capacity. While special cases of these problems can be solved with relative ease, the general problem having variable item weights and box sizes can become very difficult to solve. This paper presents a heuristic procedure for solving large loading problems of the more general type.
The procedure uses a surrogate procedure for reducing the original problem to a simpler knapsack problem, the solution of which is then employed in searching for feasible solutions to the original problem. The procedure is easy to apply, and is capable of identifying optimal solutions when they are found.

I. INTRODUCTION

The loading problem involves the optimal allocation of n objects i = 1, 2, …, n, each having a given value c_i and weight w_i, to m boxes, j = 1, 2, …, m, each having capacity b_j. Several types of loading problems exist, as indicated in Eilon and Christofides [4]. Two of these are:

PROBLEM 1 (P1): Given that Σ_j b_j ≥ Σ_i w_i, determine the minimum number of boxes required to accommodate all items.

PROBLEM 2 (P2): Given that Σ_j b_j < Σ_i w_i (or Σ_j b_j ≥ Σ_i w_i but not all objects can be accommodated), determine the maximum value of objects accommodated in the boxes.

The integer program for problem (P1) can be written as follows:

(1)    (P1) minimize Σ_j d_j y_j

(2)    subject to Σ_j x_ij = 1, ∀i

644 J.C. FISK & M.S. HUNG

(3)    Σ_i w_i x_ij ≤ b_j y_j, ∀j

(4)    y_j = 0, 1, ∀j;  x_ij = 0, 1, ∀i, j,

where

    y_j = 1 if box j contains one or more objects, 0 otherwise,
    x_ij = 1 if object i is placed in box j, 0 otherwise,

and d_j is the cost of box j. For d_j = 1, ∀j, the problem reduces to determining the minimum number of boxes required to hold all objects. If d_j = b_j, ∀j, the above problem becomes that of determining the minimum capacity set of boxes required to hold all objects. For (P2) the integer programming problem is

(5)    (P2) maximize Σ_i Σ_j c_i x_ij

(6)    subject to Σ_j x_ij ≤ 1, ∀i

(7)    Σ_i w_i x_ij ≤ b_j, ∀j

(8)    x_ij = 0, 1, ∀i, j.

Eilon and Christofides present a heuristic procedure for solving a special case of problem (P1) in which d_j = 1, ∀j. In addition, they introduce an enumerative algorithm based on the work of Balas [2] which yields satisfactory results. A more efficient algorithmic procedure for this problem, which again takes advantage of uniform box costs, is presented by Hung and Brown [10].
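For very small instances, the (P2) formulation can be checked by complete enumeration; the following toy sketch (data and names are ours, not from the paper) assigns each item to a box or leaves it out, and keeps the best capacity-feasible assignment:

```python
from itertools import product

def solve_p2_bruteforce(c, w, b):
    """Enumerate all assignments x[i] in {0..m} (0 = unassigned,
    j >= 1 = box j) and return the best objective of (P2): maximize
    the value of placed items subject to the box capacities."""
    n, m = len(c), len(b)
    best, best_x = 0, None
    for x in product(range(m + 1), repeat=n):
        load = [0] * m
        ok = True
        for i, j in enumerate(x):
            if j:
                load[j - 1] += w[i]
                if load[j - 1] > b[j - 1]:
                    ok = False
                    break
        if ok:
            val = sum(c[i] for i, j in enumerate(x) if j)
            if val > best:
                best, best_x = val, x
    return best, best_x

# Toy instance: total weight 19 exceeds total capacity 10, so some
# items must be left out, as in (P2).
value, assignment = solve_p2_bruteforce(
    c=[10, 7, 6, 4], w=[6, 5, 4, 4], b=[6, 4])
print(value, assignment)
```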
For solving problem (P2), Ingargiola and Korsh [12] introduce an ordering relation which allows a reduction in the amount of searching required within an enumerative scheme. Hung and Fisk [11] present procedures which rely on Lagrangian and surrogate relaxations to yield good bounds in a branch and bound scheme. Similar procedures have been developed by Martello and Toth [13]. Each of these procedures appears to yield satisfactory results as long as the number of items is small (≤ 100) and the number of boxes does not exceed three.

This paper presents a simple and effective heuristic procedure for solving loading problems (P1) and (P2) of much larger scale than those that have been attempted before. The procedure is similar to that of Glover [9] in that it uses surrogate constraints to obtain some feasible solutions, but it has two distinctive features usually not found in heuristic procedures. One is that our procedure uses the surrogate constraints to reduce the problems (P1) and (P2) to simpler problems which in fact are the well-known knapsack problems. Then we use the solutions to the knapsack problems to reduce the set of variables to be considered later on. Another feature is that our procedure will identify optimal solutions if they are found. More specifically, if the reduced set of variables produces a feasible solution then we know that the solution is optimal.

SOLVING LARGE LOADING PROBLEMS 645

II. SURROGATE RELAXATIONS FOR (P1) AND (P2)

The concept and applicability of surrogate relaxation were introduced by Glover [7], [8], while useful refinements were suggested by Balas [3] and Geoffrion [6]. Surrogate relaxation in its simplest form is to replace a set of constraints by a single constraint (the surrogate constraint). For example, for problem (P1) a nonnegative vector of real numbers α = (α_j) can be used to aggregate the m constraints in (3) into a single one,

(9)    Σ_j Σ_i α_j w_i x_ij ≤ Σ_j α_j b_j y_j.

Similarly for (P2), a nonnegative real vector π = (π_j) can be used to combine the m constraints in (7) into the following,

(10)    Σ_j Σ_i π_j w_i x_ij ≤ Σ_j π_j b_j.

Let (P1_α) and (P2_π) respectively denote the surrogate relaxations of (P1) and (P2). The relaxations (P1_α) and (P2_π) can provide good bounds on the optimal solutions of their respective original problems given a suitable choice of multipliers. Balas [3], Geoffrion [6] and Hung and Fisk [11] have shown that one suitable choice is to set them equal to the optimal dual multipliers of the aggregated constraints in the linear programming relaxation. For example, let π̄ = (π̄_j) represent the set of optimal dual multipliers of constraints (7) in the linear program of (P2). The linear program is obtained by replacing the 0-1 constraints (8) with unit intervals 0 ≤ x_ij ≤ 1 for all i, j. Hung and Fisk [11] showed that

    π̄_j = c_t / w_t    for all j,

where t is the smallest object index such that Σ_{i=1}^{t} w_i ≥ Σ_j b_j. Items are assumed to be ordered such that c₁/w₁ ≥ c₂/w₂ ≥ … ≥ c_n/w_n. For (P1), if ᾱ = (ᾱ_j) represents the set of optimal dual multipliers of constraints (3) in the linear program of (P1), then

    ᾱ_j = d_u / b_u    for all j,

where u is the smallest box index such that Σ_{j=1}^{u} b_j ≥ Σ_i w_i. The boxes are assumed ordered such that d₁/b₁ ≤ d₂/b₂ ≤ … ≤ d_m/b_m.

Since ᾱ_j is a constant for all j, constraint (9) can be simplified and the surrogate problem (P1_ᾱ) becomes a single knapsack problem as follows:

    (P1_ᾱ)  minimize   Σ_j d_j y_j
            subject to Σ_i w_i x_i = Σ_i w_i ≤ Σ_j b_j y_j,
                       y_j = 0, 1, ∀j,

where x_i = Σ_j x_ij. Similarly, (P2_π̄) has the following simple form:

    (P2_π̄)  maximize   Σ_i c_i x_i
            subject to Σ_i w_i x_i ≤ Σ_j b_j,
                       x_i = 0, 1, ∀i,

where x_i = Σ_j x_ij.

The single knapsack problems defined within (P1_ᾱ) and (P2_π̄) can be easily solved (see, e.g., Ahrens and Finke [1], Fisk [5]).
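The surrogate knapsack in (P2_π̄) pools all box capacities into a single capacity Σ_j b_j. As an illustration, the sketch below solves it with a standard dynamic-programming routine (a plain alternative to the merge-and-sort methods cited above; names and data are ours):

```python
def knapsack_01(values, weights, capacity):
    """0-1 knapsack by dynamic programming over integer capacities.
    Returns (best value, sorted list of chosen item indices)."""
    best = [0] * (capacity + 1)          # best[c]: value using capacity c
    keep = [[False] * (capacity + 1) for _ in values]
    for i, (v, w) in enumerate(zip(values, weights)):
        # Iterating capacity downward keeps each item usable at most once.
        for c in range(capacity, w - 1, -1):
            if best[c - w] + v > best[c]:
                best[c] = best[c - w] + v
                keep[i][c] = True
    chosen, c = [], capacity
    for i in range(len(values) - 1, -1, -1):
        if keep[i][c]:
            chosen.append(i)
            c -= weights[i]
    return best[capacity], sorted(chosen)

# Surrogate relaxation of a toy (P2): one knapsack whose capacity is
# the total box capacity sum(b_j) = 6 + 4 = 10.
val, items = knapsack_01([10, 7, 6, 4], [6, 5, 4, 4], 6 + 4)
print(val, items)   # bound on (P2); the chosen items define the set S
```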
It is clear that an optimal solution to the relaxed problem, either (P1_ᾱ) or (P2_π̄), may violate the original constraints ((3) in (P1) and (7) in (P2)), because the assignment of items to boxes is ignored in the relaxed problems. However, if a feasible assignment can be found among the items and boxes identified in the optimal solutions to the relaxed problems, then optimal solutions to the original problems are found. Let S represent a set of items while T represents a set of boxes. For (P1) let

    S = {all items},
    T = {all boxes for which y_j = 1 in the optimal solution to (P1_ᾱ)},

while for (P2) let

    S = {all items for which x_i = 1 in the optimal solution to (P2_π̄)},
    T = {all boxes}.

When solving (P1), if an assignment of S to T is found which satisfies constraint sets (2)-(4), then this assignment represents a feasible and optimal solution to (P1). Similarly, when solving (P2), if an assignment of S to T is found which satisfies constraint sets (6)-(8), then this assignment represents a feasible and optimal solution to (P2). The following section describes a procedure which searches for feasible and (therefore) optimal assignments of S to T.

III. AN EXCHANGE ROUTINE FOR (P1) AND (P2)

The exchange routine to be described is similar in many respects to the heuristic algorithm presented by Eilon and Christofides. For a particular problem, the procedure uses only the sets S and T as previously defined, and searches for an assignment of items which is feasible for the original problem. If such a feasible assignment is found, then it also represents an optimal assignment for the original problem.

The exchange procedure we use is the following:

STEP 1: Solve (P1_ᾱ) [or (P2_π̄) if the problem is to solve (P2)] and identify the set of items S and the set of boxes T. Place the items in S in a list in descending order according to weight. The boxes in T can be placed in a new list in any order.
STEP 2: If no items remain in the list, stop; the optimal solution has been found. Otherwise, take the first item in the list and attempt to place it in a (randomly selected) box. If sufficient space remains to accommodate the item in the box chosen, go to STEP 4; if insufficient space remains, go to STEP 3.

STEP 3: Attempt to place the item in one of the remaining boxes. If such a box exists, go to STEP 4; otherwise, go to STEP 5.

STEP 4: Record the assignment of the item and its weight to the box and remove the item from the list. Return to STEP 2.

STEP 5: List the set of boxes in descending order of remaining capacity. Let m_B be the box number for which minimum excess capacity remains. Considering boxes one and two in the list only, attempt a one-for-one exchange of items between boxes such that the available space in one of the boxes is fully utilized. If such an exchange can be made amongst all possible one-for-one exchanges, record it and return to STEP 2. If no such exchange is possible, attempt two-for-one, then one-for-two exchanges, box one to box two. If such an exchange is possible, record it and go to STEP 2; otherwise, go to STEP 6.

STEP 6: Repeat STEP 5 for boxes one and three, one and four, etc. up to one and m_B, then two and three, two and four, etc. up to two and m_B, and so on. If a satisfactory exchange is still not evident, terminate; the heuristic procedure does not yield a feasible solution to the original problem.

It is important to note that the above exchange routine will always produce a feasible solution to (P2) and, if every item in S is successfully placed in boxes, an optimal solution as well. Of course, when some items in S cannot be assigned to boxes, the solution may still be optimal, but unproven. Furthermore, for (P1) the exchange routine can be modified to always yield a feasible solution.
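A much-simplified version of STEPS 2-6 can be sketched as follows: descending-weight first-fit placement, with a single one-for-one exchange attempt when an item does not fit. The two-for-one and one-for-two exchanges of STEP 5, the box ordering, and the random box selection are omitted, and all names are ours:

```python
def place_with_exchange(weights, caps):
    """Simplified sketch of the exchange routine: items (taken in
    descending weight) are placed first-fit; on failure, try one
    one-for-one swap between two boxes to free enough room.
    Returns a box index per item, or None for unplaced items."""
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    load = [0.0] * len(caps)
    boxes = [[] for _ in caps]            # item indices per box
    assign = [None] * len(weights)

    def put(i, j):
        load[j] += weights[i]; boxes[j].append(i); assign[i] = j

    for i in order:
        j = next((j for j in range(len(caps))
                  if load[j] + weights[i] <= caps[j]), None)
        if j is not None:
            put(i, j)
            continue
        # Simplified STEP 5: one-for-one exchange, then retry item i.
        done = False
        for a in range(len(caps)):
            for b in range(len(caps)):
                if a == b or done:
                    continue
                for x in boxes[a]:
                    for y in boxes[b]:
                        da = weights[y] - weights[x]  # change in load[a]
                        if (load[a] + da <= caps[a] and
                                load[b] - da + weights[i] <= caps[b]):
                            boxes[a].remove(x); boxes[b].remove(y)
                            load[a] += da; load[b] -= da
                            boxes[a].append(y); boxes[b].append(x)
                            assign[x], assign[y] = b, a
                            put(i, b)
                            done = True
                            break
                    if done:
                        break
        # If still not placed, assign[i] stays None (heuristic failure).
    return assign

print(place_with_exchange([6, 5, 4, 3], [9, 9]))
```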
The modification is that in STEP 6, when it appears that not all items can be assigned to the boxes in T, T may be expanded to include a box not originally belonging to T. Again, in such a case an optimal solution to the original problem (P1) may still be found, but we cannot prove its optimality.

IV. COMPUTATIONAL EXPERIENCE AND CONCLUSIONS

The procedure as described has been programmed in Fortran V code and run on a UNIVAC 1110. While (P1) and (P2) represent different problem situations, the solution procedure we use for each is essentially the same, and the results obtained for one of the problems using our procedure would be expected to reflect closely the procedure's effectiveness in solving the other. For this reason, computational results for solving (P2) only will be presented. The single knapsack routine we use for solving the surrogate relaxation is adopted from Program β of Ahrens and Finke [1].

648 J.C. FISK & M.S. HUNG

A series of 100 problems consisting of up to 1000 objects and up to six boxes was obtained by generating values and weights independently from a uniform distribution on the interval [10, 100]. Box capacities were then generated in a similar manner except that the interval b_l ≤ b_j ≤ b_u was used, where b_l = [.4(Σw_i/n)] and b_u = [.6(Σw_i/n)]; the brackets [ ] denote the greatest integer less than or equal to the enclosed quantity. The final box capacity generated, b_m, was chosen such that the occupancy ratio = Σb_j/Σw_i = .5. If b_m < min w_i or max b_j < max w_i, the set of generated box capacities was discarded and a new set generated. The occupancy ratio of .5 was used for all problems attempted.

Table 1 indicates computation times for our algorithm when solving (P2). For each item/box combination a total of ten problems was attempted, and the total number of proven optimal solutions to (P2) using the exchange routine was recorded along with solution times.
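The generation scheme can be sketched as follows. This is our reading of the description above (in particular, how the final capacity is adjusted to hit the occupancy ratio), not the authors' Fortran code:

```python
import random

def generate_p2_instance(n, m, seed=0):
    """Generate a (P2) test problem roughly as in Section IV:
    values and weights uniform on [10, 100]; the first m-1 box
    capacities uniform on [0.4*wbar, 0.6*wbar] with wbar = sum(w)/n,
    and the last capacity set so the occupancy ratio sum(b)/sum(w)
    is about 0.5.  Capacity sets that cannot admit every item are
    discarded and regenerated."""
    rng = random.Random(seed)
    c = [rng.uniform(10, 100) for _ in range(n)]
    w = [rng.uniform(10, 100) for _ in range(n)]
    wbar = sum(w) / n
    while True:
        b = [int(rng.uniform(0.4 * wbar, 0.6 * wbar)) for _ in range(m - 1)]
        b.append(int(0.5 * sum(w)) - sum(b))   # forces the ratio to ~0.5
        if min(b) >= 0 and max(b) >= max(w):
            return c, w, sorted(b)

c, w, b = generate_p2_instance(200, 4)
print(len(b), sum(b) / sum(w))   # 4 boxes, occupancy ratio near 0.5
```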
For all problems except one, the optimal solution was obtained upon the first application of the exchange routine. For the remaining problem, the exchange routine was rerun with objects assigned to boxes according to a (new) random order, at which time an optimal solution was found. As an indication of the relative efficiency of the procedure, the algorithm of Hung and Fisk [11] using a Lagrangian relaxation solved, after 250 seconds of CPU time (UNIVAC 1108), only four problems in a ten-problem set containing 200 items and four boxes. The procedure presented here was able to solve all ten of these problems in 4.6 seconds.

TABLE 1. Computational Results for (P2)

                        Four Boxes                          Six Boxes
    Number of    Solution      Proven Optimal    Solution      Proven Optimal
    Items        Time*         Solutions         Time          Solutions**
       50        .03/.02/.05        10           .03/.03/.06         9
      100        .09/.05/.14        10           .09/.05/.14        10
      200        .35/.11/.46        10           .35/.13/.48        10
      500        2.0/.6/2.6         10           2.0/.6/2.6         10
     1000        7.9/2.1/10.1       10           7.9/2.1/10.1       10

*Average CPU seconds per problem for surrogate relaxation/exchange routine/total
**For the one problem not terminating optimally, a feasible solution was found whose solution value was within 1.5% of the surrogate bound.

As can be seen from Table 1, the computational efficiency of the heuristic is less sensitive to the number of objects in the knapsack algorithm than are other algorithms [4], [10], [11], [12], and [13]. Furthermore, the exchange routine is much less sensitive to the number of boxes than are the other algorithms. As a further example of this, a set of ten problems generated as in Table 1 but having 1000 objects and 10 boxes was solved in about the same total CPU time as required by the problem set having 1000 items and 6 boxes. All problems were again proven optimal. In effect, the overall efficiency of the heuristic is equivalent to that of a knapsack algorithm.
As a final test of our procedure, we chose to solve a series of problem sets containing 200 items and four boxes, generated as in Table 1 but having narrower ranges of item weights and box sizes. The results are summarized in Table 2. A total of ten problems was attempted for each set, and the number of problems for each range of item weights and box sizes which terminated optimally is specified.

Table 2 indicates that the exchange procedure remains effective as long as some variation exists amongst item weights. The amount of variation in box sizes seems to have little effect upon the ability of the procedure to terminate optimally. That all ten problems terminate optimally for the problem set in which no variability is allowed within either item weights or box sizes is apparently fortuitous. For those problems in Table 2 which did not terminate optimally, feasible solutions were obtained having solution values which were in every case within 2.1% of the surrogate bound.

TABLE 2. Effect of Variation in Box and Item Weights Upon Ability to Terminate Optimally

    Range of              Range of Box Sizes
    Item Weights    .4b̄ ≤ b_j ≤ .6b̄*    .45b̄ ≤ b_j ≤ .55b̄    b_j = .5b̄
    10-100               10**                  10                 10
    25-85                10                    10                 10
    40-70                10                    10                 10
    55                                                            10

*b̄ = Σw_i/4
**Number of problems terminating optimally

As seen from Tables 1 and 2, the exchange routine described here appears to be quite effective in obtaining provably optimal solutions to loading problems of the type (P1) or (P2) in which the numbers of items and boxes are large and at least some variation exists in item weights. For smaller problems, and for problems in which little or no variation in item weights exists, available optimizing procedures may be more appropriate.

REFERENCES

[1] Ahrens, J.H. and G. Finke, "Merging and Sorting Applied to the 0-1 Knapsack Problem," Operations Research, Vol. 23, pp. 1099-1109 (1975).
[2] Balas, E., "An Additive Algorithm for Solving Linear Programs with Zero-One Variables," Operations Research, Vol. 13, pp. 517-546 (1965).
[3] Balas, E., "Discrete Programming by the Filter Method," Operations Research, Vol. 19, pp. 915-957 (1967).
[4] Eilon, S. and N. Christofides, "The Loading Problem," Management Science, Vol. 17, pp. 259-268 (1971).
[5] Fisk, J., "An Initial Bounding Procedure for Use with 0-1 Single Knapsack Algorithms," Opsearch, Vol. 14, pp. 88-98 (1977).
[6] Geoffrion, A., "An Improved Implicit Enumeration Approach for Integer Programming," Operations Research, Vol. 17, pp. 437-454 (1969).
[7] Glover, F., "Surrogate Constraints," Operations Research, Vol. 16, pp. 741-749 (1968).
[8] Glover, F., "Surrogate Constraint Duality in Mathematical Programming," Operations Research, Vol. 23, pp. 434-451 (1975).
[9] Glover, F., "Heuristics for Integer Programming Using Surrogate Constraints," Decision Sciences, Vol. 8, No. 1, pp. 156-166 (1977).
[10] Hung, M.S. and J.R. Brown, "An Algorithm for a Class of Loading Problems," Naval Research Logistics Quarterly, Vol. 25, pp. 289-297 (1978).
[11] Hung, M.S. and J.C. Fisk, "An Algorithm for 0-1 Multiple Knapsack Problems," Naval Research Logistics Quarterly, Vol. 25, pp. 571-579 (1978).
[12] Ingargiola, G. and J. Korsh, "An Algorithm for the Solution of 0-1 Loading Problems," Operations Research, Vol. 23, pp. 1110-1119 (1975).
[13] Martello, S. and P. Toth, "Solution of Zero-One Multiple Knapsack Problems," presented at the ORSA/TIMS National Meeting, Atlanta (1977).

SEARCH FOR AN INTELLIGENT EVADER 653

are applicable only when certain relationships obtain between the detection probabilities for the various regions. In Section 3, a continuous-time version of the problem is described for which it transpires that P* = P_g. This continuous-time version is asymptotically equivalent to the discrete-time version as the detection probabilities tend to zero.
Section 4 summarizes our investigation of the relationship between P* and P_g for different values of N and of the detection probabilities. P_g is a particularly good approximation to P* when the q_i (i = 1, 2, …, N) are either all sufficiently large, or all sufficiently small, or not too dissimilar. Furthermore, if the range of the q_i values is held more or less constant, the accuracy of the approximation does not vary greatly with N. The case of N = 2 may as a result be used as a point of reference, and a method of doing this is described.

Finally, in Section 5 we assess the N-region problem as viewed by the other player, the searcher. Although it transpires that the searcher's optimal strategy is likely to be difficult to determine exactly, this section shows that satisfactory approximations to it are usually determinable without too much difficulty.

2. THE DETERMINATION OF P* AND V(P*)

This section describes three distinct approaches to the evaluation of P* and V(P*). The first is quite general and may be used for any problem of N regions, where N is limited only by the capacity of one's computational facilities. There are no restrictions on the values of the escape probabilities. The accuracy of P* and V(P*) which this approach yields can be made precise to any arbitrary degree.

The second method applies to problems which are multiples of smaller problems whose characteristics are already known. For example, the six-region problem (r₁, r₂, r₃, r₄, r₅, r₆) is related to the smaller problem (r′, r″) if r₁, r₂, r₃ are all equal to r′, and r₄, r₅, r₆ are all equal to r″: that is, the former problem consists simply of three blocks of the latter. So if P* and V(P*) have been established for the smaller problem, this approach indicates how these same characteristics may be obtained for the larger problem.
The third approach is applicable to problems of any N where the escape probabilities assume only two values and where one escape probability is an integer power of the other. These conditions lead to particularly simple expressions for P* and V(P*), and these expressions yield useful bounds for V(P*) in situations where such conditions are not satisfied.

(i) A general method

We begin by outlining a procedure for finding the expected payoff V(P) at any vector P, by assuming that the searcher always plays optimally; that is, he consistently searches the region with the greatest current p_i q_i. To simplify the exposition, assume that if i ≤ j, then p_i q_i ≥ p_j q_j, both for the original vector P and for the vectors into which it is transformed by (1). Suppose then, without loss of generality, that P is such that p₁q₁ ≥ p₂q₂ ≥ … ≥ p_N q_N.

654 J.C. GITTINS & D.M. ROBERTS

Let b_ijk be the number of searches in region i before the k-th search of region j. From (2), assuming the searcher's policy is as specified above, we can write

    p_i r_i^(b_ijk − 1) q_i ≥ p_j r_j^(k−1) q_j

and

    p_i r_i^(b_ijk) q_i < p_j r_j^(k−1) q_j.

Suppose λ is such that

    p_i r_i^λ q_i = p_j r_j^(k−1) q_j.

Hence

    λ = [log(p_j q_j / p_i q_i) + (k − 1) log r_j] / log r_i

and

    b_ijk − 1 ≤ λ < b_ijk.

We can therefore write

(3)    b_ijk − 1 = Int{[log(p_j q_j / p_i q_i) + (k − 1) log r_j] / log r_i}
                 = Int{log(p_j q_j / p_i q_i) / log r_i + (k − 1) n_j / n_i},

where Int denotes the integer part, and we take r_i and r_j to be related by r_i^(1/n_i) = r_j^(1/n_j), where n_i and n_j are any numbers, not necessarily integer.

Next we define

    V_ji = E(no. of searches in region j | evader in region i),
    V_i = E(total no. of searches | evader in region i) = Σ_{j=1}^{N} V_ji.

From the definition of q_i, V_ii = 1/q_i. Otherwise, we can use the definition of b_ijk to write expressions for V_ij in the following two situations:

(a) i < j

    V_ij = b_ij1 + r_j(b_ij2 − b_ij1) + r_j²(b_ij3 − b_ij2) + r_j³(b_ij4 − b_ij3) + …
         = q_j(b_ij1 + r_j b_ij2 + r_j² b_ij3 + …).

To a first approximation, using (3),

    b_ijk = b_ij1 + (k − 1) n_j / n_i.

So therefore

    V_ij = b_ij1 + q_j (n_j / n_i) Σ_{k=2}^{∞} (k − 1) r_j^(k−1).

THE SEARCH FOR AN INTELLIGENT EVADER CONCEALED IN ONE OF AN ARBITRARY NUMBER OF REGIONS

J.C. Gittins
University Mathematical Institute
Oxford, England

D.M. Roberts
Ministry of Defence
London, England

ABSTRACT

This paper considers the search for an evader concealed in one of an arbitrary number of regions, each of which is characterized by its detection probability. We shall be concerned here with the double-sided problem in which the evader chooses this probability secretly, although he may not subsequently move; his aim is to maximize the expected time to detection, while the searcher attempts to minimize it. The situation where two regions are involved has been studied previously and reported on recently. This paper represents a continuation of this analysis.

It is normally true that as the number of regions increases, optimal strategies for both searcher and evader are progressively more difficult to determine precisely. However, it will be shown that, generally, satisfactory approximations to each are almost as easily derived as in the two-region problem, and that the accuracy of such approximations is essentially independent of the number of regions. This means that so far as the evader is concerned, characteristics of the two-region problem may be used to assess the accuracy of such approximate strategies for problems of more than two regions.

1. INTRODUCTION

In a recent paper (Roberts and Gittins [5], hereinafter referred to as R & G) an analysis was given of a search problem involving two regions. The analysis is extended in this paper to problems of similar type but with an arbitrary number of regions. Such problems may be described as follows:

Suppose a stationary object is hidden in one of N distinct regions. The probability of its being concealed in region i (i = 1, 2, …, N) will be denoted by p_i, and the location probability vector by P = (p₁, p₂, …, p_N). Each region is characterized by its detection probability q_i, which is the probability that a search of region i will discover the object if it is there; to avoid unnecessary complications, suppose 0 < q_i < 1.

Often it will be advantageous to specify a region in terms of its escape probability r_i, where r_i = 1 − q_i. We assume that the time taken to search any region is constant, and take this constant to be the unit of time.

From Bayes' theorem it follows that an unsuccessful search of region j changes the location probability vector as shown below.

(1)    p_j → p_j r_j / (1 − p_j q_j);    p_i → p_i / (1 − p_j q_j)  for all i ≠ j.

It has been shown by, among others, Black [1] that the strategy which at any time searches the region with the greatest current value of p_i q_i minimizes the expected time until the object is found. For a given initial P this minimum is denoted V(P). Usually such a strategy is deterministically defined, and as such can be considered as pure. However, there are occasions when p_i q_i is maximized for more than one value of i. Clearly in these circumstances the searcher can choose between pure strategies, each of which will lead to the minimum expected time to detection. We shall adhere to the terminology used in Norris [4] by referring to these pure strategies as 'good' strategies.

We can extend equation (1) to determine that the transformation due to a sequence of searches which involves a total of k_i searches of region i, for each i, will be as follows:

(2)    p_i → p_i r_i^(k_i) / Σ_{j=1}^{N} p_j r_j^(k_j).

Significantly, the order of the sequence has no effect on the final transformation.

In R & G this single-sided problem was considered in the form of a double-sided search problem in which the initial value of P is chosen secretly by the object (or evader). In this form the problem is a zero-sum game between the searcher and the evader, the payoff to the evader being the time which the searcher takes to find him.
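Update (1) and the order-independence remark following (2) are easy to verify numerically. A small sketch (ours, not from the paper):

```python
def update(p, q, j):
    """Posterior location probabilities after one unsuccessful
    search of region j (equation (1))."""
    denom = 1.0 - p[j] * q[j]
    return [pi * (1 - q[j]) / denom if i == j else pi / denom
            for i, pi in enumerate(p)]

p = [0.5, 0.3, 0.2]
q = [0.4, 0.6, 0.9]

# Searching region 0 then region 1 gives the same posterior as the
# reverse order: only the search counts k_i matter, as in (2).
a = update(update(p, q, 0), q, 1)
b = update(update(p, q, 1), q, 0)
print(a)
print(b)
```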
This game has been considered by Bram [2], who showed that it does possess a value, and that therefore the appropriate strategies for the searcher and the evader are the minimax and maximin strategies respectively. It is easy to show that if the evader is allowed to move quite freely between searches then his maximin strategy is given at each stage by the location probability vector P_g, defined such that p_i q_i is the same for all i. In R & G it was shown that when N = 2, P_g is a remarkably good approximation to the evader's maximin strategy P* for the more complicated, and often more realistic, problem in which the evader, once hidden, remains stationary. This observation also led to nearly optimal search strategies. The significance of these approximations is that they are much more easily calculated than the exact solutions. In this paper we show that for arbitrary values of N the approximation P* = P_g remains extremely good under most circumstances, and this is our most important conclusion.

Detailed methods of calculating P* to any desired accuracy are given in Section 2. One of these can be used for any set of detection probabilities. The others are abridged methods which

Similarly, to a second approximation,

    b_ijk = b_ij2 + (k − 2) n_j / n_i,

so that

    V_ij = q_j b_ij1 + r_j [b_ij2 + (n_j / n_i)(r_j / q_j)].

And for the l-th approximation,

    V_ij = q_j b_ij1 + q_j r_j b_ij2 + q_j r_j² b_ij3 + … + r_j^(l−1) b_ijl + (n_j / n_i)(r_j^l / q_j).

(b) i > j

    V_ij = (r_j^(b_ji1) − r_j^(b_ji2)) × 1 + (r_j^(b_ji2) − r_j^(b_ji3)) × 2 + (r_j^(b_ji3) − r_j^(b_ji4)) × 3 + …
         = r_j^(b_ji1) + r_j^(b_ji2) + r_j^(b_ji3) + …

To a first approximation, again using (3),

    V_ij = r_j^(b_ji1) (1 + r_i + r_i² + …) = r_j^(b_ji1) / q_i.

Similarly, to a second approximation,

    V_ij = r_j^(b_ji1) + r_j^(b_ji2) (1 + r_i + r_i² + …) = r_j^(b_ji1) + r_j^(b_ji2) / q_i.

And for the l-th approximation,

    V_ij = r_j^(b_ji1) + r_j^(b_ji2) + … + r_j^(b_ji,l−1) + r_j^(b_jil) / q_i.

Thus we have established a series of successive approximations for V_i(P), depending on the value of l up to which b_ijl is given its exact integer value.
Since

(4)    V(P) = Σ_i p_i V_i(P),

we can thus calculate V(P) to any desired accuracy. In an examination of problems of various sizes, it was found that l = 10 gave very good results indeed, typically resulting in a V(P) accurate to within ±10⁻⁴ of its true value. If l = 20 is used, then understandably V(P) is much closer to its true value; it was always observed to be within ±10⁻⁶ and frequently far closer.

Having found a precise method of calculating V(P), it is necessary next to find an approach to evaluating P* and hence V(P*). Since P_g is always easy to calculate, and is always close to P* if not actually equal to it, an obvious approach is to proceed as follows. Starting from the vector P_g, and having calculated V(P_g), form a new vector P by increasing its first component by, for example, 0.01, and decreasing each of its remaining components by an amount proportional to the component's magnitude. If V(P) > V(P_g), form a new vector by modifying P as P_g was modified. Continue thus until a maximum is reached. If V(P) < V(P_g), search in the opposite direction. When a maximum is reached, at P′ say (or if no payoff greater than V(P_g) is found in either direction), move from there (P′ or P_g) by changing the vector's second component, and so on. When all directions have been examined, the entire procedure can be repeated using progressively smaller increments until an acceptable P* is found.

There are two reasons why this procedure invariably leads to an acceptable P*. First, using an obvious extension to the argument in Section 2 of R & G, it may be shown that V(P) is continuous and a concave function of P; hence, the problem of a local maximum does not arise. Second, r_i < 1 for all i, and this means that P* is an interior maximum point; that is, each of its components is positive. In the illustrative examples of Sections 4 and 5 the procedure was continued to the extent of using increments of 0.0001.
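The hill-climbing scheme just described can be sketched as follows. Here V(P) is computed by directly simulating the 'greatest current p_i q_i' policy and truncating the series, rather than via the b_ijk approximations of Section 2(i); the step sizes, truncation horizon, and all names are ours:

```python
def expected_time(p, q, horizon=1000):
    """V(P): expected detection time under the optimal policy that
    always searches the region with the largest current value of
    p_i * r_i^(searches so far) * q_i, truncated after `horizon`
    searches."""
    weight = list(p)                  # unnormalized posteriors p_i * r_i^k_i
    v = 0.0
    for t in range(1, horizon + 1):
        i = max(range(len(p)), key=lambda i: weight[i] * q[i])
        v += t * weight[i] * q[i]     # P(object found exactly at step t)
        weight[i] *= 1 - q[i]
    return v

def coordinate_ascent(q, step=0.01, sweeps=20):
    """Crude coordinate search for the evader's P*, started from P_g
    (components proportional to 1/q_i, so that p_i q_i is constant)."""
    p = [1 / qi for qi in q]
    p = [x / sum(p) for x in p]
    for _ in range(sweeps):
        for i in range(len(q)):
            for sign in (+1, -1):
                while True:
                    trial = list(p)
                    trial[i] += sign * step
                    if not 0 < trial[i] < 1:
                        break
                    s = sum(trial)
                    trial = [x / s for x in trial]   # renormalize the rest
                    if expected_time(trial, q) > expected_time(p, q) + 1e-12:
                        p = trial
                    else:
                        break
    return p

q = [0.3, 0.6]
p_star = coordinate_ascent(q)
print(p_star, expected_time(p_star, q))
```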
When this sensitivity was combined with a value of l = 30 in determining V(P), one arrived at a value of V(P*) which, whenever checked against a known V(P*) (determinable as described in Sections 2(ii) or 2(iii)), was accurate to within 0.00004.

This procedure can prove to be inefficient if several regions have the same escape probability. In such cases it seems advisable to increment all the components of P associated with these particular regions together. This keeps such components equal, not an unreasonable policy when one considers that at P* they will of course be equal. It may be advisable to observe similar precautions when several regions have approximately equal escape probabilities. When the incidence of regions with the same escape probability is such that the search problem may be regarded as the result of joining together a set of sub-problems, this modification to our general procedure may be linked with a simplified method of calculating V(P). This situation will be discussed in Section 2(ii).

(ii) Problems with identical blocks of regions

Consider a problem for which there is an integral number k of blocks of regions, where the blocks (each of M regions, say) are identical in all respects. The location probability vector P^+ for such a problem may be expressed, with an obvious notation, in the form

P^+ = (p_{ij}: i = 1, 2, ..., M; j = 1, 2, ..., k).

The expected payoff if the evader uses the strategy P^+ and the searcher plays so as to minimize the expected time to detection will be denoted by V_{kM}(P^+). We shall use V_M(P) to designate the similar function for the reduced problem in which the evader is restricted to one particular block of M regions and plays the vector P = (p_i; i = 1, 2, ..., M). We shall say that P^+ corresponds to P if p_{ij} = p_i/k, i = 1, 2, ..., M; j = 1, 2, ..., k. In such cases the calculation of V_{kM}(P^+) is simplified by the following result.
THEOREM 1: If P^+ corresponds to P, then

V_{kM}(P^+) = k V_M(P) - (k-1)/2.

PROOF: The expected number of searches when the evader is hidden in the j-th block is k V_M(P) - (k-j). The probability that the evader is actually hidden in the j-th block is clearly 1/k. Hence

V_{kM}(P^+) = Σ_{j=1}^{k} (1/k)[k V_M(P) - (k-j)] = k V_M(P) - (k-1)/2.

COROLLARY: P*^+ corresponds to P* and

V_{kM}(P*^+) = k V_M(P*) - (k-1)/2.

Here P*^+ and P* are the evader's maximum strategies for a k-block problem and the related single-block problem respectively. The proof is immediate from Theorem 1 and from the observation that p*_{ij} must be independent of j for all i. It follows that the values of P* and V(P*) for any M-region problem may be used to calculate similar quantities for problems defined by adding identical blocks of M regions.

(iii) Problems with just two escape probabilities, one a power of the other

Consider a problem with regions (1, 1), (1, 2), ..., (1, A), (2, 1), (2, 2), ..., (2, B), with q_{11} = q_{12} = ... = q_{1A} = q_1 = 1 - r_1, q_{21} = q_{22} = ... = q_{2B} = q_2 = 1 - r_2, and with r_1^n = r_2, where n is an integer. The components of P_o are

p_{1i} = q_2/(A q_2 + B q_1), i = 1, 2, ..., A,
p_{2i} = q_1/(A q_2 + B q_1), i = 1, 2, ..., B.

For a problem of this type, the infinite series which, as described in Section 2(i), arise in the calculation of V(P_o) may be summed explicitly, leading to the closed expression

V(P_o) = { (1/2)A(A+1) q_2/q_1 + (1/2)A(A-1) r_1 q_2/q_1 + [(1/2)B(B+1) + (1/2)B(B-1) r_2] q_1/q_2 + AB[q_1 + r_1 q_2 + (n-1) r_2 q_1]/q_2 } / (A q_2 + B q_1).

Clearly the point P* must be such that p_{1i} = p_{1j}, 1 <= i, j <= A, and p_{2i} = p_{2j}, 1 <= i, j <= B. Closed expressions may also be obtained for V(P) for other vectors P with these properties. Consider for example the vector P' with components

p'_{1i} = q_2/(A q_2 + B q_1 r_1), i = 1, 2, ..., A,
p'_{2i} = q_1 r_1/(A q_2 + B q_1 r_1), i = 1, 2, ..., B.
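Theorem 1 is easy to check numerically. In the sketch below (our code, not the paper's; the value function sums survival probabilities under the optimal "largest p_i q_i first" rule), a two-block copy of the basic two-region problem of Section 4 reproduces the identity k V_M(P) - (k-1)/2 and the value 12.521 quoted for example 1.2 in Table 1.

```python
def V(p, r, tol=1e-12):
    # expected number of searches under the optimal search rule
    q = [1.0 - x for x in r]
    w, total = list(p), 0.0
    while sum(w) > tol:
        total += sum(w)
        i = max(range(len(w)), key=lambda j: w[j] * q[j])
        w[i] *= r[i]
    return total

r = [0.8, 0.512]
q = [1 - x for x in r]
s = sum(1 / qi for qi in q)
p = [(1 / qi) / s for qi in q]        # P_o for the single block

k = 2                                  # two identical blocks
r_plus = r * k
p_plus = [pi / k for pi in p] * k      # P^+ corresponding to P

v_single = V(p, r)                     # about 6.51067
v_blocks = V(p_plus, r_plus)           # about 12.521 (Table 1, example 1.2)
assert abs(v_blocks - (k * v_single - (k - 1) / 2)) < 1e-6
```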
Thus the searcher's strategy which minimizes the expected time to detection, given that the evader has chosen to play P', starts with searches of each of the first A regions, and it follows from equation (2) that this transforms the vector P' into the vector P_o. We have

V(P') = P(evader is found in one of the first A searches) x E(no. of searches required | evader is found in one of the first A searches) + P(evader is not found in one of the first A searches) x E(no. of searches required | evader is not found in one of the first A searches)
     = A p'_1 q_1 x (1/2)(A + 1) + (A p'_1 r_1 + B p'_2) x (V(P_o) + A).

Similar closed expressions may be obtained for V(P) for all those vectors P which have the symmetries stated above and which may be transformed into the vector P_o. These are the values of P which are of most interest, since P* must be one of them. This may be shown along the lines described in R & G for the case N = 2.

3. THE SEARCH PROBLEM IN CONTINUOUS TIME

The N regions are now characterized by their detection rates λ_i (i = 1, 2, ..., N). As before the evader may not move, but the location probability vector changes by Bayes' theorem as the search progresses. Its value at time t is denoted by (p_1(t), p_2(t), ..., p_N(t)), and P_o is the vector for which p_i λ_i is a constant. In continuous time it is natural to allow the searcher to divide his effort at any given time between the N regions. If u_i(t), a piecewise continuous* function of t, is the proportion of his total effort allocated to region i at time t [u_1(t) + u_2(t) + ... + u_N(t) = 1], the probability of the evader being detected in the time interval (t, t + δt) if he is in region i, conditional on detection not having occurred before time t, is

(5) λ_i u_i(t) δt + o(δt),

except at a point of discontinuity of the function u_i(t).
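As a numerical illustration (our check, not in the paper): for the basic two-region problem of Section 4 (A = B = 1, r_1 = 0.8, r_2 = r_1^3 = 0.512), the closed expression for V(P') given above can be compared with a direct computation that sums survival probabilities under the optimal search rule. The two agree, since the optimal searcher facing P' does indeed search the first region once and thereafter faces P_o.

```python
def V(p, r, tol=1e-12):
    # expected searches: always search the region with the largest w_i * q_i
    q = [1.0 - x for x in r]
    w, total = list(p), 0.0
    while sum(w) > tol:
        total += sum(w)
        i = max(range(len(w)), key=lambda j: w[j] * q[j])
        w[i] *= r[i]
    return total

A = B = 1
r1, r2 = 0.8, 0.512                    # r1**3 == r2
q1, q2 = 1 - r1, 1 - r2
den = A * q2 + B * q1 * r1             # denominator of P'
p1, p2 = q2 / den, q1 * r1 / den       # components of P'

# V(P_o) for the same problem, computed directly
v_po = V([q2 / (A * q2 + B * q1), q1 / (A * q2 + B * q1)], [r1, r2])

# closed expression for V(P') from the text
v_closed = A * p1 * q1 * (A + 1) / 2 + (A * p1 * r1 + B * p2) * (v_po + A)
v_direct = V([p1, p2], [r1, r2])
assert abs(v_closed - v_direct) < 1e-6
```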
The probability of detection in (t, t + δt) if the location of the evader is given by the initial probability vector, conditional on detection not having occurred before time t, is therefore

(6) Σ_{i=1}^{N} λ_i u_i(t) p_i(t) δt + o(δt),

and for the sake of convenience we shall write this in the form ρ(t)δt + o(δt). The expressions (5) and (6) lead respectively, by elementary arguments, to alternative expressions for F(t), the probability that detection does not take place before time t. We have

(7) F(t) = Σ_{i=1}^{N} p_i(0) exp[-λ_i U_i(t)] = exp[-∫_0^t ρ(s) ds],

where U_i(t) = ∫_0^t u_i(s) ds. Also, if T is the time to detection, it is well known, and easily shown by integration by parts, that

(8) E(T) = ∫_0^∞ F(t) dt.

*It is not difficult to modify the argument so that u_i(t) is required only to be a measurable function of t.

As for the discrete-time case, the evader's strategy is P = (p_1(0), p_2(0), ..., p_N(0)). The searcher's strategy is the vector function u which for any t >= 0 takes the value (u_1(t), u_2(t), ..., u_N(t)). We shall denote the evader's maximum strategy by P* and proceed now to prove

THEOREM 2: For the continuous-time problem, P* = P_o.

PROOF: We note firstly that it follows from equations (7) and (8) that

E(T) = ∫_0^∞ Σ_{i=1}^{N} p_i(0) exp[-λ_i U_i(t)] dt.

It may be shown that this expression is minimized if and only if the searcher uses the continuous-time version of the rule that he should search the region with the greatest current p_i q_i; specifically,

u_i(t) > 0  implies  λ_i p_i(t) = max_j λ_j p_j(t),

except possibly at a set of times of Lebesgue measure zero. This result follows from a variational argument similar to that used by Gittins [3] for another resource allocation problem. Alternatively it may be established using the concept of uniform optimality discussed by Stone [6]. Under such a rule there is clearly some t_1 (< ∞) such that

t_1 = inf {t: λ_1 p_1(t) = λ_2 p_2(t) = ... = λ_N p_N(t)},

and

(9) p_i(t) = λ_i^{-1} / Σ_{k=1}^{N} λ_k^{-1}, i = 1, 2, ..., N; t >= t_1.
From Bayes' theorem we have

p_i(t) = p_i(0) exp[-λ_i U_i(t)] / Σ_{j=1}^{N} p_j(0) exp[-λ_j U_j(t)], i = 1, 2, ..., N; t >= 0,

so that

(10) λ_1 p_1(0) exp[-λ_1 U_1(t)] = λ_2 p_2(0) exp[-λ_2 U_2(t)] = ... = λ_N p_N(0) exp[-λ_N U_N(t)], t >= t_1.

Now differentiating equation (10) with respect to t, and dividing by (10), gives

u_i(t) = λ_i^{-1} / Σ_{j=1}^{N} λ_j^{-1}, i = 1, 2, ..., N; t >= t_1.

Thus from (6),

(11) ρ(t) = 1 / Σ_{k=1}^{N} λ_k^{-1}, t >= t_1.

A similar argument shows that

(12) ρ(t) >= 1 / Σ_{k=1}^{N} λ_k^{-1}, t < t_1.

From equations (7), (8), (11) and (12) it follows that if the searcher uses a continuous-time version of the principle that, in order to minimize the expected searching time, he should search the region with the greatest current p_i q_i, then

E(T) <= Σ_{j=1}^{N} λ_j^{-1},

with equality if and only if t_1 = 0. Now t_1 = 0 if and only if p_i(0)λ_i is a constant; that is to say, if and only if P = P_o. Thus the evader's maximum strategy is P_o and the theorem is proved.

4. STRATEGIES FOR THE EVADER

The Two-Region Problem and Related, Larger Problems

From an examination of even a few N-region problems, one very soon concludes that several of their characteristics depend on the value of N. Any description is therefore perhaps best undertaken in terms of increasing N, and we shall begin with the simple two-region problem r_1 = 0.8, r_2 = 0.512. (This problem has, as it happens, been chosen so that r_1^3 = r_2; but, as we shall see later, this has not in any way affected the generality of the conclusions.) For this particular problem there are no fewer than three ways of finding V(P_o), V(P*) and P*. (P_o of course is always simply determined, from p_i q_i being a constant.) First we note that the method of Section 2(iii) may conveniently be followed. Second, the general approach of Section 2(i) can be used, although usually we cannot expect to arrive at an exact V(P_o) (though in this particular case, we do, fortuitously), nor an exact P* and V(P*).
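The equality case of Theorem 2 can be checked directly: if the evader plays P_o (p_i(0) proportional to λ_i^{-1}) and the searcher uses the stationary allocation u_i proportional to λ_i^{-1}, then λ_i u_i is the same for every region, the survival probability is exp(-t/Σ_k λ_k^{-1}), and E(T) = Σ_j λ_j^{-1}. A small sketch (ours; the rates are arbitrary illustrative values):

```python
lams = [0.5, 1.0, 2.5]                       # detection rates (illustrative)
Lam = sum(1.0 / l for l in lams)             # sum of lambda_j^{-1}

p0 = [(1.0 / l) / Lam for l in lams]         # P_o: p_i(0) proportional to 1/lambda_i
u = [(1.0 / l) / Lam for l in lams]          # stationary allocation u_i

# E(T) = integral over t of sum_i p_i(0) exp(-lambda_i u_i t)
#      = sum_i p_i(0) / (lambda_i u_i)
ET = sum(p / (l * ui) for p, l, ui in zip(p0, lams, u))
assert abs(ET - Lam) < 1e-12                 # E(T) = sum_j lambda_j^{-1}
```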
Third, as in very many two-region problems (see R & G), a dynamic programming method, first described in Norris [4], is entirely feasible. So for this problem we have

V(P_o) = 6.51067,  V(P*) = 6.53006,  V(P*)/V(P_o) = 1.00298.

If in our examination of the effects of increasing N we see this as a starting point, then clearly we can look at a whole range of four-region problems, all related to this basic two-region problem in the sense that the largest and the smallest of the four escape probabilities are 0.8 and 0.512. It is often convenient to classify N-region problems in terms of the largest and smallest escape probabilities; we shall refer to these as r_l and r_s respectively. Four of these problems are listed in Table 1.

TABLE 1.

Example   r_1    r_2      r_3      r_4      V(P_o)   V(P*)    V(P*)/V(P_o)
1.1       0.8    0.8      0.8      0.512    15.501   15.526   1.00160
1.2       0.8    0.8      0.512    0.512    12.521   12.560   1.00310
1.3       0.8    0.512    0.512    0.512     9.574    9.609   1.00362
1.4       0.8    0.7      0.6      0.512    11.312   11.327   1.00127

Three features are immediately evident from the table:

(i) The largest V(P*)/V(P_o) exceeds the corresponding ratio in the two-region problem.

(ii) This occurs in a problem where only the largest and the smallest escape probabilities are present (example 1.3); problems where r_2, r_3 are distributed between r_l and r_s have V(P*)/V(P_o) ratios which are very much less (see for example 1.4).

(iii) Example 1.2 consists of two blocks of our original two-region problem. For this example V(P*) and V(P_o), as well as of course P*, could have been derived from the corresponding characteristics of the two-region problem using Theorem 1.

These features continue to manifest themselves as N increases; to eight, for instance, as in Table 2. (Unlike Table 1, this table does not contain all the frequencies of r_l and r_s; however, the example for which V(P*)/V(P_o) takes its maximum value has been included.)

TABLE 2.
Example   r_1    r_2      r_3    r_4      r_5      r_6      r_7      r_8      V(P_o)   V(P*)    V(P*)/V(P_o)
2.1       0.8    0.8      0.8    0.8      0.8      0.8      0.8      0.512    33.498   33.525   1.00081
2.2       0.8    0.8      0.8    0.8      0.8      0.512    0.8      0.512    30.503   30.553   1.00163
2.3       0.8    0.512    0.8    0.512    0.8      0.512    0.8      0.512    24.543   24.620   1.00316
2.4       0.8    0.512    0.8    0.512    0.512    0.512    0.512    0.512    18.649   18.718   1.00372
2.5       0.8    0.65     0.75   0.6      0.7      0.55     0.65     0.512    21.188   21.198   1.00047

Specifically, the largest V(P*)/V(P_o) ratio continues to exceed the corresponding ratio in the two-region problem; also it has increased slightly. Moreover, it occurs, as previously, in a problem where the only escape probabilities present are either r_l or r_s (example 2.4 of Table 2). Once again we find some eight-region problems corresponding to four-region problems, or even, in one case, to our basic two-region problem (examples 2.4, 2.3, 2.2 corresponding to 1.3, 1.2, 1.1 respectively; example 2.3 is also composed of four blocks of the two-region problem).

If we look once again at example 2.4, we see that it consists of two identical blocks, each of four regions, where each block is that of example 1.3. If now we go on to consider the 12-region problem

(A) r_1 = r_2 = r_3 = 0.8, r_4 = r_5 = ... = r_12 = 0.512,

then using Theorem 1 (noting that this problem consists of three identical blocks) we can determine the corresponding ratio V(P*)/V(P_o). As the number of blocks increases still further this ratio will certainly increase, but will never exceed 1.00382. Likewise, from our basic two-region problem we can infer that for all problems of N regions, where half of the regions have escape probability 0.8 and the other half 0.512, V(P*)/V(P_o) is bounded above by 1.00322.

From Tables 1 and 2 we note that those examples with the largest ratios V(P*)/V(P_o) both have three times as many regions with escape probability 0.512 as with 0.8 (examples 1.3, 2.4). This seems to be fortuitous.
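The upper bounds just quoted follow from Theorem 1: for k identical blocks the ratio is (k V_M(P*) - (k-1)/2)/(k V_M(P_o) - (k-1)/2), which increases with k towards (V_M(P*) - 1/2)/(V_M(P_o) - 1/2). A quick check (our code; the block values are those quoted in Section 4 and Table 1):

```python
def block_ratio(v_star, v_o, k):
    # ratio V(P*)/V(P_o) for k identical blocks, via Theorem 1
    return (k * v_star - (k - 1) / 2) / (k * v_o - (k - 1) / 2)

def limit_ratio(v_star, v_o):
    # monotone limit of block_ratio as the number of blocks k grows
    return (v_star - 0.5) / (v_o - 0.5)

# basic two-region problem: V(P_o) = 6.51067, V(P*) = 6.53006
assert abs(limit_ratio(6.53006, 6.51067) - 1.00322) < 1e-4

# example 1.3 of Table 1: V(P_o) = 9.574, V(P*) = 9.609
assert block_ratio(9.609, 9.574, 3) < limit_ratio(9.609, 9.574)
```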
For in the 12-region problem the largest V(P*)/V(P_o) ratio occurs in the example

(B) r_1 = r_2 = r_3 = r_4 = 0.8, r_5 = ... = r_12 = 0.512,

although its value (1.00378) exceeds only slightly that of example (A).

The N-Region Problem: General Conclusions

Had we started with any other pair of escape probabilities as our basic two-region example, and proceeded to examine associated problems of up to 12 regions, then almost certainly we would have observed features similar to (i), (ii) and (iii) above. In the course of this study such a procedure was carried out about a dozen times on a set of problems intended to give as representative a picture as possible. In each case this observation was found to apply, and, as might be expected, such an analysis enables one to be more explicit in one's conclusions.

Specifically, the largest [V(P*) - V(P_o)]/V(P_o) increases with increasing N. It exceeds that of the two-region problem, but is generally less than twice as large. Exceptions do exist, although they seem to be confined to problems where one (and sometimes, though rarely, two) escape probabilities are large (> 0.85, say) and all the rest are by comparison quite small (about 0.1 or less); for example, in the 12-region problem

r_1 = 0.95, r_2 = ... = r_12 = 0.1285,

V(P*) is about 4.5% greater than V(P_o), whereas in the related two-region problem

r_1 = 0.95, r_2 = 0.1285,

V(P*) is about 1% greater than V(P_o).

In two-region problems where V(P*) and V(P_o) are equal, one finds that in larger, related problems the same generally applies. Although this was not found to be invariably so, such differences as were observed were always of the order of 0.2% or less. We are therefore able to say something about the ratio of V(P*) to V(P_o) in N-region problems (where N takes values up to 12 at least) provided we possess certain information about the related two-region problem. The sort of information required is shown in Figure 1.
We can see there that the two-region problem r_1 = 0.8, r_2 = 0.2 possesses a V(P*)/V(P_o) ratio of about 1.015. What can we say, then, about the 12-region problem r_1 = 0.8 > r_2 > ... > r_11 > r_12 = 0.2? First, if r_2, ..., r_11 are distributed at all within this interval, then the ratio of V(P*) to V(P_o) is almost certain to be very much closer to one. Second, all such 12-region problems, even those consisting solely of the two extreme escape probabilities r_l and r_s, will have a V(P*) within 3% of V(P_o); as indicated above, such exceptions to this conclusion as do exist tend to be well defined, and this is certainly not one of them.

In any class of problems of the same size N, characterized by the same r_l and r_s, the problem with the largest proportional difference between V(P*) and V(P_o) will be one of those N - 1 problems whose escape probabilities are r_l and r_s only; for instance, example 1.3 of Table 1. What justification is there for proposing this conjecture? First, considerations of symmetry assure us that for any N-region problem with all regions having the same escape probability, P_o and P* are the same. (One could call this a perfectly balanced problem.) Second, where several escape probabilities are involved, and where r_l and r_s are respectively the largest and the smallest of them, it can be seen that a "less balanced" problem can be created (i.e., one moves further away than one already was from the "all escape probabilities equal" situation) by moving some of those probabilities not equal to r_l or r_s to one extreme, with the remainder being moved to the other extreme. This conjecture has been examined in a number of cases and has always been found to be correct.

Special Relationships Existing Between Escape Probabilities

All of the conclusions discussed so far were described using as a basis examples where the largest and smallest escape probabilities, r_l and r_s, were such that the ratio of the logarithms of r_l and r_s is an integer.
In Table 1, for instance, this ratio is 3. Inevitably this inclines one to ask whether such a relationship has affected our conclusions in any way at all. This question has been considered quite extensively, but no evidence has been found to suggest that such relationships have any influence. The behaviour of V(P*)/V(P_o) as a function of the escape probabilities has been explored where no integer relationship exists between these probabilities. The function's general characteristics were found to be indistinguishable from those observed where integer relationships do exist.

[Figure 1. The two-region problem. The contours show the ratio V(P*)/V(P_o); one labelled region of the diagram has V(P*)/V(P_o) < 1.0075.]

5. STRATEGIES FOR THE SEARCHER

We have confined our attention so far to an analysis of what the evader should do in particular situations. We saw that very often he can be satisfied with merely calculating P_o, knowing that V(P_o) is so close to V(P*) as to be adequate for his purposes. This suggests that the searcher might apply one of the good strategies vis-a-vis P_o. While this would often be a satisfactory procedure, such a strategy generally turns out to be less adequate for the searcher than P_o is for the evader. However, the searcher may always proceed via the calculations described in Section 2(i).

In making these calculations he will of course determine an arbitrarily precise approximation to P* (which for convenience we shall refer to as P^+). However, he will also have determined the components V_i(P^+) from which, using equation (4), V(P^+) is derived:

V(P^+) = Σ_i p_i^+ V_i(P^+).

A typical set of components for an eight-region problem is shown in Table 3.

TABLE 3.
i           1        2       3        4        5        6        7        8
r_i         0.8      0.75    0.7      0.65     0.65     0.6      0.55     0.512
V_i(P^+)    21.2907  21.031  21.1195  21.7895  20.7895  21.1753  21.1265  21.2774

V_i(P^+) is defined in Section 2(i) as the expected payoff assuming that the evader is actually hiding in region i; it is shown as a function of P^+ merely to signify that it corresponds to a good strategy on the searcher's part vis-a-vis an evader strategy P^+. This search strategy happens to be pure, for V_i(P^+) was calculated on that basis. Significantly, at P* itself there will be several of these good strategies. The searcher's minimax strategy is one which ensures that the expected payoff is not greater than the value of the game, irrespective of what strategy the evader has played. It is a mixed strategy obtained by randomizing over those strategies which are good against P*. The randomization yields a set of V_i(P*) each equal to the value of the game, V(P*).

For the example of Table 3, V(P^+) is 21.1985. It is interesting to note by how little any of the V_i(P^+) differ from V(P^+). In particular, the largest of them (21.7895) is only 3% greater than V(P^+), which by the definition of P* is less than or equal to V(P*). This means that if the searcher plays the good strategy vis-a-vis P^+, then he limits the evader to a maximum expected payoff of 21.7895, whatever strategy the latter happens to have chosen.

The situation we have just described, where the largest V_i(P^+) is only a few percent greater than V(P^+), is quite common, and seems invariably to be the rule where the escape probabilities are well spread out between r_l and r_s, as in Table 3. (This is in contrast to the somewhat clustered form of examples 2.3 and 2.4 of Table 2.) Should this situation not apply, then the searcher might be advised to consider points in the vicinity of P^+ with a view to finding one where it does.
Significantly also, having calculated the V_i(P) for several values of P, there is always the possibility that by randomizing over some of the associated pure strategies he can effectively decrease still further the variation with respect to i. The reason for this is that if the searcher plays the good strategy associated with the vector P_j with probability π_j (j = 1, 2, ..., M), then the expected time to detection, given that the evader is in region i, is

Σ_{j=1}^{M} π_j V_i(P_j).

Randomizing over different good strategies associated with the same vector P has a similar effect. We can see this by referring once again to Table 3. The pure strategy corresponding to P^+ searches region 4 prior to region 5 even though r_4 = r_5; this accounts for the difference of one between V_4(P^+) and V_5(P^+). But we could equally well have chosen to search region 5 before region 4. By selecting either pure strategy with probability 0.5, the searcher can effectively modify Table 3 so that V_4(P^+) = V_5(P^+) = 21.2895, thereby limiting the evader to a maximum possible payoff (of 21.2907, if the latter is actually hiding in region 1) within about 0.4% of V(P^+), and of course even closer to V(P*).

The procedure outlined in Section 2(i) can be divided into two parts. The first described how one could find the payoff V(P) corresponding to any evader strategy P; the second suggested how, by beginning with P_o and iterating, one could converge towards P*. In discussing possible policies for the searcher we have so far been assuming that he actually follows this procedure to the extent of determining a P^+ near to P*, and then examines the values of V_i(P) at P^+ and in its vicinity. However, a large number of iterations is often needed to reach P^+. This immediately suggests a simplified approach open to the searcher.
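The effect of randomizing over the two tied pure strategies can be reproduced directly from Table 3 (our sketch; the figures are those printed above):

```python
# V_i(P+) from Table 3; the pure strategy searches region 4 before region 5
VA = [21.2907, 21.031, 21.1195, 21.7895, 20.7895, 21.1753, 21.1265, 21.2774]

# the alternative pure strategy searches region 5 before region 4,
# which simply exchanges the entries for regions 4 and 5
VB = VA[:]
VB[3], VB[4] = VB[4], VB[3]

# mix the two pure strategies with probability 1/2 each
mix = [(a + b) / 2 for a, b in zip(VA, VB)]

assert abs(mix[3] - 21.2895) < 1e-9 and mix[3] == mix[4]
worst = max(mix)                              # 21.2907, attained in region 1
assert (worst - 21.1985) / 21.1985 < 0.005    # within about half a percent of V(P+)
```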
It is as follows. Start with P_o and, using the first part of the procedure of Section 2(i), find V(P_o). Next take a sample of vectors (N, say) evenly distributed about P_o, and for each of these find V(P). From this sample select a P for which the components V_i(P) vary as little as possible. We have

V(P) <= V(P*) <= max_i V_i(P).

Clearly, if max_i V_i(P) - V(P) is small, the searcher may feel justified in playing a good strategy vis-a-vis P. Otherwise, of course, he can proceed to the evaluation of P^+, and then derive his strategy as described above.

REFERENCES

[1] Black, W.L., "Discrete Sequential Search," Information and Control, Vol. 8, pp. 159-162 (1965).
[2] Bram, J., "A 2-Player N-Region Search Game," IRM-31, Operations Evaluation Group, Center for Naval Analyses, Washington, D.C. (Jan. 1963).
[3] Gittins, J.C., "Optimal Resource Allocation in Chemical Research," Advances in Applied Probability, Vol. 1, pp. 238-270 (1969).
[4] Norris, R.C., "Studies in Search for a Conscious Evader," MIT Lincoln Laboratory Technical Report No. 279 (1962).
[5] Roberts, D.M. and J.C. Gittins, "The Search for an Intelligent Evader: Strategies for Searcher and Evader in the Two-Region Problem," Naval Research Logistics Quarterly, Vol. 25, No. 1 (Mar. 1978).
[6] Stone, L.D., Theory of Optimal Search (Academic Press, New York and London, 1975).

THE QUEUEING SYSTEM M^X/G/1 AND ITS RAMIFICATIONS

M. L. Chaudhry
Royal Military College of Canada
Kingston, Ontario, Canada

ABSTRACT

This paper deals with the bulk arrival queueing system M^X/G/1 and its ramifications. In the system M^X/G/1, customers arrive in groups of size X (a random variable) by a Poisson process, the service time distribution is general, and there is a single server. Although some results for this queueing system have appeared in various books, no unified account of these, such as is presented here, appears to have been reported so far.
The chief objectives of the paper are (i) to unify by an elegant procedure the relationships between the p.g.f.'s

P(z) = Σ_{n=0}^∞ P_n z^n and P^+(z) = Σ_{n=0}^∞ P_n^+ z^n,

where P_n and P_n^+ are the limiting probabilities of the queue length being n at random and departure epochs respectively, (ii) to correct an error in the paper by Krakowski and generalize his results, and (iii) to discuss some other interesting cases of the system M^X/G/1 and its special cases.

INTRODUCTION

Several authors have discussed one aspect or another of the queueing system M^X/G/1, in which groups of random size X, following a Poisson process, join the system and are served individually by a single server whose service time distribution is general. For example, Gaver [6], using renewal-theoretic arguments, discusses, among other things, P(z), the probability generating function (p.g.f.) of the limiting distribution of the number in the system at a random point in time; Sahbazov [14], using the imbedded Markov chain technique, discusses P^+(z), the p.g.f. of the number in the system at a departure epoch, and the waiting time distribution for a random customer of an arrival group. The waiting time distribution discussed by Sahbazov [14] and some other authors seems to have been incorrectly reported; the correct formulation has recently been given by Burke [1]. It may be pointed out here that it is possible to derive P^+(z) either from the paper of Harris [9], some results of which are reported in Section 5.1.10 of Gross and Harris [7], or from the works of some other authors.

In queueing problems, among other things, the main interest in general centers on obtaining P_n. However, to simplify the analysis, Kendall, in his two important papers, proposed the use of imbedded Markov chains, and researchers then obtained P_n^+ or other such probabilities by appropriately defining the regeneration points. Later, other researchers, in order to get P_n, first obtained P_n^+ and then related P_n^+ to P_n.
However, one can easily get P_n directly in many single-server queueing systems with bulk arrival or bulk service or both, and, if need be, P_n^+ from P_n. It is towards this purpose that this paper is chiefly addressed. To do this, we consider the queueing system M^X/G/1. The procedure has been successfully employed by Chaudhry and Templeton [2] in obtaining similar results for the bulk service queueing system M/G^B/1, wherein the server has a maximum capacity B.

Gross and Harris [7] relate the probabilities P_n to different types of imbedded Markov chain probabilities π_n using the semi-Markov approach. Their probabilities π_n are different from the probabilities P_n^+ (defined more precisely later) in that the π_n's are considered at departure epochs and at arrival epochs when the system is empty, whereas the P_n^+'s are considered only at departure epochs. Thus, clearly, the set of epochs of the imbedded chain considered in this paper is a subset of the set of epochs of the imbedded chain considered by Gross and Harris [7]. Although the probabilities P_n are given in Gross and Harris, the technique of connecting the P_n's and P_n^+'s is new and elegant in that one does not have to first obtain π_n from the imbedded chain as defined in Gross and Harris and then use the theory of semi-Markov processes or renewal-theoretic arguments to get P_n, which has been the normal practice so far. The method discussed here is new in that we reverse the procedure of first getting P_n^+ (or π_n) and then P_n; i.e., we first get P(z), and then the relation between the p.g.f.'s of P_n and P_n^+ follows immediately, and consequently the relations between the various moments of the underlying distributions.
Such relations between the moments neither appear to have been explicitly discussed in the literature, nor would they be easily obtainable even if one tried to use the relation given in Gross and Harris [7], the exception to this statement being a recent result reported in Krakowski [11], the details of which are given in the next paragraph. It should perhaps be emphasized here that though the technique of the supplementary variable is standard, its full impact is not yet known, as will be revealed through the results presented in this paper. Besides, the technique is more powerful than the other techniques discussed so far, because by using the procedure discussed in this paper one can, as mentioned earlier, obtain similar results for several other single-server queueing systems with bulk service, or possibly bulk arrival and bulk service, in which the service time distribution is general. Obtaining results similar to the ones presented here for the latter type of queueing systems through other techniques would be much more difficult, if not impossible, than through the procedure discussed here.

In a recent paper, Krakowski [11] finds the average queue size at three epochs of time (random, just before arrival, and just after departure) for several queueing systems. However, for the system M^X/G/1, he obtains, using intuitive arguments, the above averages only at an arrival epoch or at a random epoch (see the scholion to his Theorem B, or Section 4). Unfortunately, his result (S.23) has been incorrectly reported. Krakowski's [11] equation (S.23) would be correct if he were considering groups of constant size, but that does not appear to be the case. By using mathematically sounder arguments, we later give the correct expression for Krakowski's result (S.23). In fact, by our method one can find not only the relations between averages, but also relations between higher-order moments of the underlying distributions.
It may be appropriate here to mention other related work which has been discussed in the literature. Foster and Perera [5], using renewal-theoretic arguments, have discussed relations between the steady-state probability generating functions (p.g.f.'s) of queue size considered at the above three epochs of time for the system GI^r/M/1, wherein customers following a recurrent process arrive in batches of fixed size r and are served by a single server whose service time density is b(x) such that

b(x) = μ e^{-μx}, x > 0, μ > 0.

Foster [4] also establishes heuristically some relations for the more general system GI^r/G/1. However, no such relations have been systematically reported for the system M^X/G/1, in which the size of the arrival groups is random, an assumption that would be better in many situations than the one in which the size of the arrival groups is constant.

In this paper we carry out an analysis similar to the one carried out by Chaudhry and Templeton [2], for the queueing system M^X/G/1, wherein groups of random size X, following a Poisson process, join the system and are served individually by a single server whose service time distribution is general. In this way we can, in principle, not only obtain relations between all the moments of queue size (for the queueing system M^X/G/1) at the three epochs of time under consideration, but also discuss some other interesting properties of M^X/G/1 and unify the results of many authors. As Foster did, so shall we use the term "queue size" in all three cases under discussion.

The symbol E_k' will indicate a modified Erlangian distribution with p.d.f.

Σ_{r=1}^{k} c_r {(μt)^{r-1}/(r-1)!} μ exp(-μt).

E_k, which can be obtained from E_k' by putting c_r = δ_{kr}, where δ_{kr} is the Kronecker symbol, will then be the usual Erlang (gamma) distribution, which is the convolution of k exponential distributions each with mean 1/μ.

THE SYSTEM M^X/G/1

P(z)

Let N(t) be a random variable (r.v.) representing the queue size at time t.
Let groups of customers arrive at epochs \sigma_0', \sigma_1', ..., \sigma_n', ..., their size being a r.v. X such that

P(X = x) = a_x,  \sum_{x=1}^{\infty} a_x = 1,  and  a = \sum_{x=1}^{\infty} x a_x < \infty.

The arrival epochs form a Poisson (random) process, the mean interarrival time being 1/\lambda. The service times of individual customers are independent, identically distributed r.v.'s with common density b(v) such that 1/\mu = \int_0^\infty v b(v) dv < \infty. Let \sigma_1, \sigma_2, ..., \sigma_n, ... be the epochs of departures of customers from the system. Let us now define the following probabilities:

(i) P_j = \lim_{t \to \infty} P(N(t) = j), the limiting probability of j in the system at a random epoch of time;

(ii) P_j^- = \lim_{n \to \infty} P(N(\sigma_n' - 0) = j), the limiting probability (as n \to \infty) of j in the system just before an arrival epoch;

(iii) P_j^+ = \lim_{n \to \infty} P(N(\sigma_n + 0) = j), the limiting probability (as n \to \infty) of j in the system just after a departure epoch.

Assuming that the various limiting probabilities exist, which they do when \rho = \lambda a/\mu < 1, it will be established that

(1)  P^-(z) = P(z) = {a(1 - z)/[1 - A(z)]} P^+(z),

where P(z) = \sum_j P_j z^j, etc., are the p.g.f.'s. The first equation of (1) is easily established because of the randomness of arrivals. The proof of the second equation requires a somewhat more rigorous argument. We first discuss P(z) and then its relation to P^+(z). To get P(z), we introduce the following notation. Let

(a) \eta(x) be the conditional service rate, so that the service time density and distribution functions are, respectively, given by

b(x) = \eta(x) \exp[-\int_0^x \eta(t) dt],  B(x) = 1 - \exp[-\int_0^x \eta(t) dt];

(b) P_n(x, t) = \lim_{\Delta x \to 0} P[N(t) = n, x < X(t) \le x + \Delta x]/\Delta x,

where X(t) is the elapsed service time of the customer undergoing service at time t;

(c) P_0(t) = P[N(t) = 0].

Now, to find the p.g.f. P(z) of N(t) in the limiting case, we proceed as in Cox [3].
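As a quick numerical illustration of the conditional-service-rate notation in (a) (an added sketch, not part of the original derivation; the Erlang-2 service law b(x) = \mu^2 x e^{-\mu x} is assumed purely as an example): the hazard rate is \eta(x) = b(x)/[1 - B(x)] = \mu^2 x/(1 + \mu x), and exp[-\int_0^x \eta(t) dt], computed by quadrature, should reproduce the survivor function 1 - B(x) = e^{-\mu x}(1 + \mu x).

```python
import math

def eta(x, mu):
    # hazard (conditional service) rate of the Erlang-2 density b(x) = mu^2 x e^{-mu x}:
    # eta(x) = b(x) / (1 - B(x)) = mu^2 x / (1 + mu x)
    return mu * mu * x / (1.0 + mu * x)

def survival_from_hazard(x, mu, steps=200000):
    # 1 - B(x) = exp(-integral_0^x eta(t) dt), trapezoidal quadrature
    h = x / steps
    total = 0.5 * (eta(0.0, mu) + eta(x, mu))
    for k in range(1, steps):
        total += eta(k * h, mu)
    return math.exp(-h * total)
```

For \mu = 2 and several values of x, the quadrature agrees with the closed form e^{-\mu x}(1 + \mu x) to high accuracy, confirming that b, B, and \eta are related exactly as stated in (a).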
Since the arguments and steps in deriving the steady-state equations are the same as in [3], we only give the differential equations in the limiting case as t \to \infty, with the notation \lim_{t \to \infty} P_n(x, t) = P_n(x), \lim_{t \to \infty} P_0(t) = P_0:

(2)  0 = -\lambda P_0 + \int_0^\infty P_1(x) \eta(x) dx,

(3)  dP_n(x)/dx = -(\lambda + \eta(x)) P_n(x) + \lambda \sum_{m=1}^{n-1} a_m P_{n-m}(x),  n \ge 1,

which are to be solved under the so-called boundary condition

(4)  P_n(0) = \int_0^\infty P_{n+1}(x) \eta(x) dx + \lambda a_n P_0,  n \ge 1,

and the normalizing condition

(5)  P_0 + \sum_{n=1}^{\infty} \int_0^\infty P_n(x) dx = 1.

Define the p.g.f.'s

(6)  P(z; x) = \sum_{n=1}^{\infty} P_n(x) z^n,  A(z) = \sum_{m=1}^{\infty} a_m z^m.

Once again applying Cox's procedure to equations (2) and (3) and using (4) and (6), we get the following results:

(7)  P(z; x) = P(z; 0) (1 - B(x)) \exp[\lambda(A(z) - 1) x]

and consequently

\int_0^\infty P(z; x) dx = z P_0 {\bar b(\lambda - \lambda A(z)) - 1}/[z - \bar b(\lambda - \lambda A(z))],

where \bar b(\alpha) = \int_0^\infty e^{-\alpha x} b(x) dx and

P(z; 0) = \lambda z P_0 [A(z) - 1]/[z - \bar b(\lambda - \lambda A(z))],

which can be obtained by using (4) and (7). Finally,

(8)  P(z) = \int_0^\infty P(z; x) dx + P_0 = P_0 (1 - z)/[1 - z/\bar b(\lambda - \lambda A(z))],

where P_0 = 1 - \rho.

The result given in (8) for the case G = E_k^c or E_k has been independently discussed by Gupta [8] and Restrepo [13], respectively. Gaver [6] obtained (8) by using renewal-theoretic arguments. We could as well obtain (8) by identifying a group with a single customer, so that the group's service time distribution is just the total service time of the members constituting the group, and then using the results for the single-server system M/G/1. However, as pointed out earlier, the present approach is different: it not only leads immediately to the result for P^+(z), but also unifies all the results reported so far, besides correcting an error in Krakowski's [11] results.

P^+(z)

To get P^+(z), or to relate P(z) to P^+(z), we first find P^+(z); the relation is then easily established. We have

P_j^+ = D \int_0^\infty P_{j+1}(x) \eta(x) dx,

where D is a normalizing constant. The p.g.f.
P^+(z) is then given by

P^+(z) = \sum_{n=0}^{\infty} P_n^+ z^n = (D/z) \sum_{n=0}^{\infty} \int_0^\infty z^{n+1} P_{n+1}(x) \eta(x) dx
       = (D/z) P(z; 0) \int_0^\infty b(x) e^{-\lambda(1 - A(z)) x} dx

(9)    = \lambda D P_0 (A(z) - 1)/[{z/\bar b(\lambda - \lambda A(z))} - 1],

where we have used P(z; 0). Using the normalizing condition and the value of P(z), we get

P^+(z) = {[1 - A(z)]/[(1 - z) a]} P(z),

as stated in (1). If a_x = \delta_{x1}, then P^+(z) = P(z), a result first established by Khintchine [10] and later by other authors. It can be shown that the result (9) agrees with the one obtained by Sahbazov [14] using the imbedded-Markov-chain procedure. Now we wish to discuss other interesting features of the system under consideration.

I. The system M^X/M/1. This may be obtained from M^X/G/1 by putting G = E_1, so that \eta(x) = \mu; then (8) gives (since \bar b(\alpha) = \mu/(\mu + \alpha))

(10)  P(z) = (1 - \rho)(1 - z)\mu/[\mu - z{\mu + \lambda - \lambda A(z)}],

which is Luchak's [12] result for M/E_k^c/1. It is thus interesting to see that the systems M^X/M/1 and M/E_k^c/1 are equivalent, not only when P(X = r) = 1 (see, e.g., the later part of section 4.3.1 of Gross and Harris [7]), but also when X is a r.v. The equivalence of the more general systems GI^r/M/1 and GI/E_r/1, wherein r is fixed, has been considered by several authors; see, for example, Foster [4]. It is possible, in principle, to show that GI^X/M/1 and GI/E_k^c/1 are equivalent, but the analysis would be a bit more cumbersome.

II. From the relation (1), one can easily see that

(11)  P_0 = a P_0^+.

The result (11) is new and exhibits an interesting phenomenon: an observer is more likely (for a > 1) to find the system empty than a departing customer is to leave it empty. Its accuracy can easily be checked, for when a = 1, (11) reduces to the known relation P_0 = P_0^+ for the single-server queueing system M/G/1.

III. Another interesting case, connected with case II or equation (11), is the relation between the imbedded Markov chain probabilities P_0 and \pi_0.
From Gross and Harris [7] or Harris [9], one can find that

(12)  \pi_0 = (1 - \rho)/(1 - \rho + a) = P_0/(P_0 + a) = P_0^+/(P_0^+ + 1),

where the last expression has been obtained by using equation (11).

IV. The various moments of queue size may be obtained from (1). In particular, if L^-, L, L^+ denote the expected queue sizes at the three epochs of time (just before an arrival, random, and just after a departure), then one can easily see from (1) that the following relation must be satisfied:

(13)  L^- = L = L^+ - (\sigma_a^2 + a^2 - a)/(2a),

where L may be obtained from (8) and is given by

(14)  L = \rho + \rho^2 (\mu^2 \sigma_b^2 + 1)/[2(1 - \rho)] + \rho (\sigma_a^2 + a^2 - a)/[2(1 - \rho) a],

where \sigma_b^2 is the variance of the service time distribution and \sigma_a^2 is the variance of the group size distribution. One can see from (14) that the average number in the queue, L_q, is given by

(15)  L_q = L - \rho = \rho^2 \mu R/(1 - \rho) + \rho (\sigma_a^2 + a^2 - a)/[2(1 - \rho) a],

where 2\mu R = \mu^2 \sigma_b^2 + 1 and R is the same as defined by Krakowski [11]. L_q is now the more general and correct form of Krakowski's [11] result (S.23), which is true only when the group size is constant rather than a random variable. Once L_q is known, one can obtain W_q from Little's formula, L_q = \lambda a W_q. Equation (13) shows that an observer (for a > 1) is more likely to see a shorter expected queue size than a departing customer leaves behind, which is consistent with the remark made in case II.

V. If one is interested in the distribution of the number in the queue, it may be obtained from P(z). For, if one defines, in the case t \to \infty, N_q as a r.v. for the number in the queue, so that N_q = N - 1 if N \ge 1 and N_q = 0 if N = 0, then

P_q(z) = E[z^{N_q}] = P_0 (z - 1)/[z - \bar b(\lambda - \lambda A(z))].

This has an interesting interpretation: it shows that the p.g.f. of the number in the system at a random epoch, for the bulk-arrival system M^X/G/1, is equal to the p.g.f. of the number in the queue times the p.g.f. of the number of customers that arrive during the service time of a customer. Such an interpretation for the system M/G/1, wherein arrivals are by singlets, is well known.
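Results (10) and (14) can be cross-checked numerically (an added illustration, not from the paper; the parameters \lambda = 1, \mu = 5 and the two-point batch law a_1 = a_2 = 0.5 are assumed). The coefficients of P(z) in (10) follow by long division of the two polynomials, and an independent set of steady-state probabilities for M^X/M/1 comes from the standard level-crossing balance \mu \pi_{n+1} = \lambda \sum_{k \le n} \pi_k P(X > n - k), which is a textbook argument rather than part of this paper.

```python
lam, mu = 1.0, 5.0                          # assumed example parameters
a = {1: 0.5, 2: 0.5}                        # assumed batch-size law P(X = x) = a_x
abar = sum(x * p for x, p in a.items())     # a = E[X]
var_a = sum(x * x * p for x, p in a.items()) - abar ** 2
rho = lam * abar / mu

# Coefficients of P(z) in (10) by long division of
# N(z) = (1-rho)(1-z)mu  by  D(z) = mu - z(mu + lam - lam*A(z)).
M = 400
Ncoef = [0.0] * M
Ncoef[0], Ncoef[1] = (1 - rho) * mu, -(1 - rho) * mu
Dcoef = [0.0] * M
Dcoef[0], Dcoef[1] = mu, -(mu + lam)
for x, p in a.items():
    Dcoef[x + 1] += lam * p
P = [0.0] * M
for n in range(M):
    P[n] = (Ncoef[n] - sum(P[k] * Dcoef[n - k] for k in range(n))) / Dcoef[0]

# Independent probabilities from the level-crossing balance
# mu*pi_{n+1} = lam * sum_{k<=n} pi_k * P(X > n-k).
def tail(j):
    return sum(p for x, p in a.items() if x > j)

pi = [1.0]
for n in range(M - 1):
    pi.append(lam / mu * sum(pi[k] * tail(n - k) for k in range(n + 1)))
s = sum(pi)
pi = [v / s for v in pi]

# Mean queue size at a random epoch from (14), exponential service:
sigma_b2 = 1.0 / mu ** 2
L = (rho + rho ** 2 * (mu ** 2 * sigma_b2 + 1) / (2 * (1 - rho))
     + rho * (var_a + abar ** 2 - abar) / (2 * (1 - rho) * abar))
L_direct = sum(n * p for n, p in enumerate(pi))
```

For these parameters the two probability sequences coincide to machine precision, \pi_0 = 1 - \rho = 0.7 as (8) requires, and both the series mean \sum n \pi_n and formula (14) give L = 4/7.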
VI. An interesting result which falls outside the preceding ones is the expected busy period of the server. One way to find it is first to find the distribution of the busy period and then, from it, the expected value. However, a derivation using an alternating renewal process is more elegant, and it is this approach that we adopt here. Since idle periods and busy periods generate an alternating renewal process, we have from the theory of renewal processes

E(X)/E(Y) = \rho/(1 - \rho),

where E(X) and E(Y) are the expected busy and idle periods, respectively. Since in M^X/G/1, by the forgetfulness property of the exponential distribution, E(Y) = 1/\lambda, we get

E(X) = a/(\mu - \lambda a),

which reduces to the well-known result for the queueing system M/G/1 if we take a_x = \delta_{x1}.

ACKNOWLEDGMENT

The research for this paper was supported (in part) by the Defense Research Board of Canada, Grant Number 3610-603. The author is extremely grateful to a referee for pointing out the relation (12) and for a few other useful recommendations.

REFERENCES

[1] Burke, P.J., "Delays in Single-Server Queues with Batch Input," Operations Research 23, 830-833, 1975.
[2] Chaudhry, M.L. and J.G.C. Templeton, "The Queueing System M/G^B/1 and its Ramifications," under submission.
[3] Cox, D.R., "The Analysis of Non-Markovian Stochastic Processes by the Inclusion of Supplementary Variables," Proceedings of the Cambridge Philosophical Society 51, 433-441, 1955.
[4] Foster, F.G., "Batched Queueing Processes," Operations Research 12, 441-449, 1964.
[5] Foster, F.G. and A.G.A.D. Perera, "Queues with Batch Arrivals II," Acta Mathematica Academiae Scientiarum Hungaricae 16, 275-287, 1965.
[6] Gaver, D.P., "Imbedded Markov Chain Analysis of a Waiting Line Process in Continuous Time," Annals of Mathematical Statistics 30, 698-720, 1959.
[7] Gross, D. and C.M. Harris, Fundamentals of Queueing Theory (John Wiley and Sons, 1974).
[8] Gupta, S.K., "Queues with Batch Poisson Arrivals and a General Class of Service Time Distributions," Journal of Industrial Engineering 15, 319-320, 1964.
[9] Harris, C.M., "Some Results for Bulk-Arrival Queues with State-Dependent Service Times," Management Science 16, 313-326, 1970.
[10] Khintchine, A., "Mathematical Theory of a Stationary Queue," Matematicheskii Sbornik 39, 73-84 (Russian), 1932.
[11] Krakowski, M., "Arrival and Departure Processes in Queues," Revue Francaise Automatique Informatique et Recherche Operationnelle V-1, 45-56, 1974.
[12] Luchak, G., "The Continuous Time Solution of the Equations of the Single Channel Queue with a General Class of Service Time Distributions by the Method of Generating Functions," Journal of the Royal Statistical Society, Series B 20, 176-181, 1958.
[13] Restrepo, R.A., "A Queue with Simultaneous Arrivals and Erlang Service Distribution," Operations Research 13, 375-381, 1965.
[14] Sahbazov, A.A., "A Problem of Service with Non-Ordinary Demand Flow," Soviet Mathematics Doklady 3, 1000-1003, 1962.

ON THE MOMENTS OF GAMMA ORDER STATISTICS

P. C. Joshi
Department of Mathematics
Indian Institute of Technology
Kanpur, India

ABSTRACT

A recurrence relation between the moments of order statistics from the gamma distribution having an integer parameter r is obtained. It is shown that if the negative moments of orders -(r-1), ..., -1 of the smallest order statistic in random samples of size n are known, then one can obtain all the moments. Tables of negative moments for r = 2 (1) 5 are also given.

1. INTRODUCTION

Let X be a gamma random variable with probability density function

(1)  f(x) = e^{-x} x^{r-1}/\Gamma(r),  x > 0,

where r > 0. Let X_1, X_2, ..., X_n be a random sample from (1), and let X_{1,n} \le X_{2,n} \le ... \le X_{n,n} be the corresponding order statistics. Denote the i-th moment of X_{k,n} by \alpha_{k,n}^{(i)}. An expression for \alpha_{k,n}^{(i)} is given by Gupta [4] when r is an integer, and by Krishnaiah and Rizvi [6] for a general value of r.
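The scheme announced in the abstract can be sketched numerically for r = 2 and n = 10, the case worked out in the example of Section 3 below (an added illustration; the quadrature routine is not from the paper). The one needed negative moment \alpha_{1,10}^{(-1)} is computed by direct quadrature of the defining integral, and recurrence (6) of Section 3, \alpha_{1,n}^{(i)} = (i/n)(\alpha_{1,n}^{(i-2)} + \alpha_{1,n}^{(i-1)}), then generates the positive moments.

```python
import math

r, n = 2, 10   # gamma shape r = 2, sample size n = 10

def integrand(x, i):
    # x^i times the density of the smallest order statistic X_{1,n}:
    # n*(1 - F(x))^(n-1)*f(x), with f(x) = x e^{-x} and 1 - F(x) = e^{-x}(1 + x) for r = 2
    return n * x ** (i + 1) * math.exp(-n * x) * (1.0 + x) ** (n - 1)

def simpson(g, lo, hi, steps):
    # composite Simpson rule; steps must be even
    h = (hi - lo) / steps
    s = g(lo) + g(hi)
    for k in range(1, steps):
        s += g(lo + k * h) * (4 if k % 2 else 2)
    return s * h / 3.0

# negative moment alpha^(-1)_{1,10} by quadrature; alpha^(0) = 1 by convention (3)
m = {-1: simpson(lambda x: integrand(x, -1), 0.0, 40.0, 200000), 0: 1.0}
for i in range(1, 5):
    # recurrence (6): alpha^(i) = (i/n)*(alpha^(i-2) + alpha^(i-1)) for r = 2, k = 1
    m[i] = (i / n) * (m[i - 2] + m[i - 1])
```

The quadrature returns \alpha_{1,10}^{(-1)} = 3.66022 (the Table 1 entry), and the recurrence then reproduces the directly evaluated moments 0.46602, 0.29320, 0.22777, 0.20839 quoted in Section 3.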
Tables of moments for selected values of n, k, r and i are given by Breiter and Krishnaiah [2] and Gupta [4]. The gamma order statistics and their moments are of great use in the analysis of life-testing data, especially for r = 1, when the gamma distribution reduces to the exponential distribution. Some applications of gamma order statistics are discussed in Gupta [4] and Young [8].

The moments of order statistics are known to satisfy some recurrence relations; see, for example, David ([3], pp. 36-38). In particular,

(2)  \alpha_{k,n}^{(i)} = \sum_{j=n-k+1}^{n} (-1)^{j-n+k-1} \binom{j-1}{n-k} \binom{n}{j} \alpha_{1,j}^{(i)}.

Thus the moments of X_{k,n} can be obtained as a linear combination of the moments of the smallest order statistics in random samples of sizes n - k + 1, ..., n. In this paper, we derive another type of recurrence relation when r is an integer. In fact, we show that higher-order moments of X_{k,n} can be obtained from those of lower order. Recurrence relations of this type for specific distributions are given by Barnett [1] for the Cauchy distribution, by the author [5] for the exponential and truncated exponential distributions, and by Shah [7] for the logistic distribution. In addition, we provide a table of negative moments \alpha_{k,n}^{(i)} for r = 2 (1) 5 and i = -(r-1), ..., -1.

2. THE NEGATIVE MOMENTS

For the gamma random variable X with density given by (1), the i-th moment E(X^i) = \Gamma(r + i)/\Gamma(r) exists for all i > -r. Consequently, the i-th moment of X_{k,n} also exists for i > -r (David [3], pp. 25-26). When r is a positive integer,

\alpha_{k,n}^{(i)} = \frac{n!}{(k-1)!(n-k)!} \int_0^\infty x^i \{F(x)\}^{k-1} \{1 - F(x)\}^{n-k} f(x) dx,

where F(x) is the cumulative distribution function of X, given by

F(x) = 1 - e^{-x} \sum_{j=0}^{r-1} x^j/j!,  x > 0.

Gupta [4] has shown that this can be written as

\alpha_{k,n}^{(i)} = \frac{n!}{(k-1)!(n-k)!\,\Gamma(r)} \sum_{p=0}^{k-1} (-1)^p \binom{k-1}{p} \sum_{m=0}^{(r-1)(n-k+p)} a_m(r, n-k+p) \frac{\Gamma(r+i+m)}{(n-k+p+1)^{r+i+m}},

where a_m(r, p) is the coefficient of t^m in the expansion of
(\sum_{u=0}^{r-1} t^u/u!)^p. For r = 1 (1) 5, he has used this relation for tabulating the first four moments of X_{k,n} for 1 \le k \le n \le 10, and of X_{1,n} for n = 11 (1) 15. In Table 1, we extend his tables to negative moments \alpha_{k,n}^{(i)} for r = 2 (1) 5, i = -(r-1), ..., -1 and 1 \le k \le n \le 10, and \alpha_{1,n}^{(i)} for n = 11 (1) 25 and the same values of r and i. These were evaluated to eight significant digits and are correct to the five decimal places as tabulated. For n \le 10, they were also checked by using the identity

\sum_{k=1}^{n} \alpha_{k,n}^{(i)} = n E(X^i).

3. THE RECURRENCE RELATION

In [5], the author has shown that for the exponential distribution (r = 1)

\alpha_{k,n}^{(i)} = \alpha_{k-1,n-1}^{(i)} + (i/n) \alpha_{k,n}^{(i-1)},  i = 1, 2, ...;  1 \le k \le n,

where we follow the conventions

(3)  \alpha_{k,n}^{(0)} = 1,  1 \le k \le n,

(4)  \alpha_{0,n}^{(i)} = 0,  i = 1, 2, ...;  n = 0, 1, 2, ....

This recurrence relation was then extended to the right-truncated exponential distribution. We now generalize this result in another direction, to the gamma distribution, and show that for integral values of r

(5)  \alpha_{k,n}^{(i)} = \alpha_{k-1,n-1}^{(i)} + (i/n) \Gamma(r) \sum_{j=0}^{r-1} \alpha_{k,n}^{(i-r+j)}/j!

for i = 1, 2, ..., 1 \le k \le n, where the conventions given at (3) and (4) are followed. To this end, let

h(x) = -\left( 1 - e^{-x} \sum_{j=0}^{r-1} x^j/j! \right)^{k-1} \left( e^{-x} \sum_{j=0}^{r-1} x^j/j! \right)^{n-k+1};

then

h'(x) = \left( 1 - e^{-x} \sum_{j=0}^{r-1} x^j/j! \right)^{k-2} \left( e^{-x} \sum_{j=0}^{r-1} x^j/j! \right)^{n-k} \left\{ n \left( 1 - e^{-x} \sum_{j=0}^{r-1} x^j/j! \right) - (k-1) \right\} \frac{e^{-x} x^{r-1}}{(r-1)!}

and

\alpha_{k,n}^{(i)} - \alpha_{k-1,n-1}^{(i)} = \binom{n-1}{k-1} \int_0^\infty x^i h'(x) dx.

Integrating by parts, by treating x^i for differentiation and h'(x) for integration, we have

\alpha_{k,n}^{(i)} - \alpha_{k-1,n-1}^{(i)} = -i \binom{n-1}{k-1} \int_0^\infty x^{i-1} h(x) dx
  = i \binom{n-1}{k-1} \int_0^\infty x^{i-1} \left( 1 - e^{-x} \sum_{j=0}^{r-1} x^j/j! \right)^{k-1} \left( e^{-x} \sum_{j=0}^{r-1} x^j/j! \right)^{n-k} e^{-x} \sum_{j=0}^{r-1} x^j/j! \; dx.

Taking the sum over j outside the integral sign, multiplying and dividing by \Gamma(r), and integrating term by term, we get

\alpha_{k,n}^{(i)} - \alpha_{k-1,n-1}^{(i)} = (i/n) \Gamma(r) \sum_{j=0}^{r-1} \alpha_{k,n}^{(i-r+j)}/j!,

which proves the result. Relation (5) expresses the i-th order moment of X_{k,n} in terms of the i-th order moment of X_{k-1,n-1} and lower-order moments of X_{k,n}. In particular, it gives the mean of X_{1,n} in terms of moments of orders -(r-1), ..., -1 of X_{1,n}, the second moment of X_{1,n} in terms of moments of orders -(r-2), ..., -1, 0, 1 of X_{1,n}, etc. Taken together with relation (2), it shows that if the negative moments of orders -(r-1), ..., -1 of the smallest order statistic in samples of size j \le n are known, then one can calculate all the moments \alpha_{k,n}^{(i)} for i = 1, 2, ... and 1 \le k \le n.

It should be noted that only non-negative terms are added in the evaluation of \alpha_{k,n}^{(i)} in equation (5). Consequently, the rounding errors are negligible for small values of r. This is illustrated in the following example.

EXAMPLE: r = 2 and k = 1. In this case equation (5) reduces to

(6)  \alpha_{1,n}^{(i)} = (i/n)(\alpha_{1,n}^{(i-2)} + \alpha_{1,n}^{(i-1)}),  i = 1, 2, ....

Thus for n = 10, say, we have from Table 1, \alpha_{1,10}^{(-1)} = 3.66022. Equation (6) then gives \alpha_{1,10}^{(1)} = 0.46602, \alpha_{1,10}^{(2)} = 0.29320, \alpha_{1,10}^{(3)} = 0.22777, \alpha_{1,10}^{(4)} = 0.20839, etc. These values agree perfectly with the values evaluated directly (see also Gupta [4]).

For all values of n, r and k, the moments of order i, 1 \le i \le 4, obtained by direct evaluation and by recurrence relation (5) agree up to eight significant digits, the precision to which the calculations were performed and on which Table 1 is based.

TABLE 1. Table of Negative Moments E(X_{k,n}^i) of Gamma Order Statistics for r = 2 (1) 5.
 n  k    r=2      r=3               r=4                        r=5
         i=-1     i=-2     i=-1     i=-3     i=-2     i=-1     i=-4     i=-3     i=-2     i=-1
 1  1   1.00000  0.50000  0.50000  0.16667  0.16667  0.33333  0.04167  0.04167  0.08333  0.25000
 2  1   1.50000  0.87500  0.68750  0.31250  0.27083  0.43750  0.08073  0.07422  0.12891  0.31836
 2  2   0.50000  0.12500  0.31250  0.02083  0.06250  0.22917  0.00260  0.00911  0.03776  0.18164
 3  1   1.88889  1.20370  0.82099  0.44833  0.35566  0.50810  0.11832  0.10295  0.16424  0.36321
 3  2   0.72222  0.21759  0.42052  0.04084  0.10118  0.29629  0.00555  0.01676  0.05824  0.22866
 3  3   0.38889  0.07870  0.25849  0.01083  0.04316  0.19560  0.00113  0.00529  0.02752  0.15813
 4  1   2.21875  1.50415  0.92792  0.57746  0.42950  0.56292  0.15485  0.12928  0.19403  0.39733
 4  2   0.89931  0.30236  0.50020  0.06093  0.13415  0.34365  0.00872  0.02397  0.07488  0.26085
 4  3   0.54514  0.13282  0.34085  0.02074  0.06820  0.24894  0.00238  0.00954  0.04159  0.19646
 4  4   0.33681  0.06066  0.23103  0.00753  0.03482  0.17783  0.00071  0.00388  0.02283  0.14536
 5  1   2.51040  1.78463  1.01852  0.70158  0.49594  0.60831  0.19056  0.15387  0.22020  0.42517
 5  2   1.05215  0.38222  0.56550  0.08102  0.16375  0.38135  0.01202  0.03089  0.08935  0.28598
 5  3   0.67004  0.18258  0.40225  0.03081  0.08975  0.28710  0.00375  0.01360  0.05317  0.22314
 5  4   0.46187  0.09965  0.29992  0.01403  0.05383  0.22349  0.00147  0.00683  0.03387  0.17868
 5  5   0.30554  0.05092  0.21381  0.00590  0.03006  0.16641  0.00053  0.00314  0.02007  0.13703
 6  1   2.77469  2.04988  1.09789  0.82168  0.55693  0.64736  0.22559  0.17713  0.24377  0.44883
 6  2   1.18894  0.45840  0.62168  0.10103  0.19097  0.41309  0.01543  0.03757  0.10235  0.30684
 6  3   0.77856  0.22986  0.45312  0.04100  0.10931  0.31788  0.00521  0.01754  0.06337  0.24427
 6  4   0.56151  0.13531  0.35138  0.02061  0.07019  0.25633  0.00229  0.00965  0.04297  0.20202
 6  5   0.41205  0.08182  0.27419  0.01074  0.04566  0.20707  0.00106  0.00542  0.02932  0.16701
 6  6   0.28424  0.04474  0.20174  0.00493  0.02694  0.15828  0.00042  0.00269  0.01822  0.13103
 7  1   3.01814  2.30292  1.16897  0.93848  0.61369  0.68180  0.26003  0.19932  0.26535  0.46951
 7  2   1.31401  0.53162  0.67144  0.12093  0.21637  0.44071  0.01891  0.04403  0.11424  0.32479
 7  3   0.87628  0.27533  0.49728  0.05127  0.12748  0.34404  0.00674  0.02140  0.07262  0.26198
 7  4   0.64827  0.16924  0.39424  0.02730  0.08509  0.28299  0.00318  0.01240  0.05103  0.22065
 7  5   0.49645  0.10986  0.31923  0.01560  0.05901  0.23633  0.00163  0.00758  0.03693  0.18805
 7  6   0.37829  0.07060  0.25617  0.00880  0.04032  0.19537  0.00083  0.00455  0.02628  0.15859
 7  7   0.26856  0.04043  0.19266  0.00429  0.02471  0.15210  0.00035  0.00238  0.01688  0.12644
 8  1   3.24502  2.54586  1.23362  1.05244  0.66703  0.71273  0.29397  0.22060  0.28537  0.48792
 8  2   1.42998  0.60239  0.71637  0.14071  0.24031  0.46528  0.02244  0.05032  0.12526  0.34060
 8  3   0.96608  0.31934  0.53667  0.06159  0.14457  0.36699  0.00832  0.02517  0.08116  0.27735
 8  4   0.72662  0.20197  0.43165  0.03407  0.09900  0.30580  0.00411  0.01511  0.05839  0.23637
 8  5   0.56992  0.13650  0.35684  0.02053  0.07118  0.26018  0.00225  0.00970  0.04368  0.20492
 8  6   0.45236  0.09387  0.29666  0.01265  0.05170  0.22203  0.00126  0.00632  0.03288  0.17793
 8  7   0.35360  0.06285  0.24267  0.00752  0.03653  0.18648  0.00068  0.00397  0.02408  0.15214
 8  8   0.25641  0.03723  0.18552  0.00382  0.02303  0.14718  0.00031  0.00215  0.01585  0.12277
 9  1   3.45832  2.78021  1.29314  1.16395  0.71753  0.74089  0.32747  0.24112  0.30409  0.50457
 9  2   1.53864  0.67103  0.75749  0.16036  0.26303  0.48749  0.02601  0.05645  0.13558  0.35478
 9  3   1.04970  0.36212  0.57242  0.07194  0.16078  0.38753  0.00994  0.02887  0.08914  0.29097
 9  4   0.79884  0.23377  0.46516  0.04090  0.11214  0.32591  0.00507  0.01777  0.06521  0.25009
 9  5   0.63636  0.16223  0.38976  0.02553  0.08257  0.28066  0.00290  0.01178  0.04986  0.21923
 9  6   0.51677  0.11592  0.33050  0.01653  0.06207  0.24379  0.00173  0.00803  0.03874  0.19347
 9  7   0.42016  0.08284  0.27974  0.01070  0.04651  0.21114  0.00103  0.00546  0.02995  0.17016
 9  8   0.33459  0.05714  0.23208  0.00661  0.03367  0.17943  0.00058  0.00354  0.02241  0.14699
 9  9   0.24664  0.03474  0.17970  0.00348  0.02170  0.14315  0.00027  0.00197  0.01503  0.11974

TABLE 1 (Continued). Table of Negative Moments E(X_{k,n}^i) of Gamma Order Statistics for r = 2 (1) 5.
 n  k    r=2      r=3               r=4                        r=5
         i=-1     i=-2     i=-1     i=-3     i=-2     i=-1     i=-4     i=-3     i=-2     i=-1
10  1   3.66022  3.00714  1.34843  1.27330  0.76562  0.76678  0.36057  0.26097  0.32173  0.51978
10  2   1.64122  0.73783  0.79554  0.17988  0.28472  0.50782  0.02961  0.06243  0.14532  0.36767
10  3   1.12832  0.40384  0.60530  0.08229  0.17626  0.40619  0.01160  0.03251  0.09665  0.30325
10  4   0.86626  0.26478  0.49570  0.04778  0.12466  0.34399  0.00607  0.02039  0.07161  0.26232
10  5   0.69769  0.18726  0.41935  0.03058  0.09336  0.29879  0.00357  0.01383  0.05561  0.23176
10  6   0.57502  0.13720  0.36018  0.02047  0.07178  0.26254  0.00222  0.00973  0.04411  0.20670
10  7   0.47793  0.10173  0.31072  0.01391  0.05560  0.23130  0.00140  0.00691  0.03516  0.18466
10  8   0.39541  0.07475  0.26646  0.00933  0.04262  0.20250  0.00087  0.00484  0.02772  0.16394
10  9   0.31938  0.05273  0.22349  0.00593  0.03144  0.17367  0.00051  0.00322  0.02108  0.14275
10 10   0.23856  0.03274  0.17484  0.00320  0.02061  0.13976  0.00024  0.00184  0.01436  0.11718
11  1   3.85237  3.22756  1.40017  1.38070  0.81163  0.79080  0.39330  0.28024  0.33845  0.53381
12  1   4.03607  3.44218  1.44888  1.48635  0.85582  0.81323  0.42570  0.29899  0.35437  0.54684
13  1   4.21235  3.65161  1.49497  1.59041  0.89840  0.83430  0.45779  0.31727  0.36958  0.55902
14  1   4.38203  3.85633  1.53876  1.69301  0.93953  0.85418  0.48961  0.33512  0.38418  0.57047
15  1   4.54581  4.05677  1.58052  1.79426  0.97938  0.87302  0.52116  0.35258  0.39822  0.58128
16  1   4.70426  4.25327  1.62047  1.89425  1.01804  0.89094  0.55246  0.36969  0.41175  0.59152
17  1   4.85787  4.44615  1.65880  1.99308  1.05563  0.90804  0.58353  0.38646  0.42484  0.60125
18  1   5.00706  4.63566  1.69566  2.09082  1.09224  0.92439  0.61438  0.40294  0.43751  0.61054
19  1   5.15220  4.82204  1.73118  2.18754  1.12794  0.94008  0.64503  0.41912  0.44980  0.61941
20  1   5.29358  5.00550  1.76548  2.28329  1.16279  0.95515  0.67548  0.43504  0.46173  0.62792
21  1   5.43150  5.18622  1.79865  2.37812  1.19686  0.96967  0.70574  0.45071  0.47335  0.63609
22  1   5.56620  5.36436  1.83079  2.47210  1.23021  0.98368  0.73583  0.46615  0.48467  0.64395
23  1   5.69788  5.54007  1.86198  2.56525  1.26287  0.99721  0.76574  0.48136  0.49570  0.65152
24  1   5.82675  5.71349  1.89227  2.65762  1.29488  1.01031  0.79550  0.49636  0.50647  0.65884
25  1   5.95298  5.88473  1.92174  2.74924  1.32630  1.02300  0.82510  0.51117  0.51700  0.66591

ACKNOWLEDGMENT

The author wishes to thank the referee for some helpful suggestions in the preparation of this paper.

REFERENCES

[1] Barnett, V.D., "Order Statistics Estimators of the Location of the Cauchy Distribution," Journal of the American Statistical Association 61, 1205-18 (1966). Correction 63, 383-5 (1968).
[2] Breiter, M.C. and P.R. Krishnaiah, "Tables for the Moments of Gamma Order Statistics," Sankhya B 30, 59-72 (1968).
[3] David, H.A., Order Statistics (Wiley: New York, 1970).
[4] Gupta, S.S., "Order Statistics from the Gamma Distribution," Technometrics 2, 243-62 (1960).
[5] Joshi, P.C., "Recurrence Relations Between Moments of Order Statistics from Exponential and Truncated Exponential Distributions," Sankhya B 39, 362-71 (1978).
[6] Krishnaiah, P.R. and M.H. Rizvi, "A Note on the Moments of Gamma Order Statistics," Technometrics 9, 315-8 (1967).
[7] Shah, B.K., "Note on the Moments of a Logistic Order Statistics," Annals of Mathematical Statistics 41, 2150-2 (1970).
[8] Young, D.H., "Moment Relations for Order Statistics of the Standardized Gamma Distribution and the Inverse Multinomial Distribution," Biometrika 58, 637-40 (1971).

A NEW STORAGE REDUCTION TECHNIQUE FOR THE SOLUTION OF THE GROUP PROBLEM

Richard V. Helgason and Jeff L. Kennington
Department of Operations Research and Engineering Management
Southern Methodist University
Dallas, Texas

ABSTRACT

This paper shows that by making use of an unusual property of the decision table associated with the dynamic programming solution to the group problem, it is possible to dispense with table storage as such, and instead overlay values for both the objective and history functions. Furthermore, this storage reduction is accomplished with no loss in computational efficiency.
An algorithm is presented which makes use of this technique and incorporates various additional efficiencies. The reduction in storage achieved for problems from the literature is shown.

I. INTRODUCTION

The group theoretic approach to integer programming was first presented by Ralph Gomory [4] in 1969. The basic theoretical results may be found in [3, 4, 7, 9, 10, 11], and computational experience with variations of the approach may be found in [2, 5, 6]. The first step is to solve the continuous relaxation of the integer program. If the solution is integer, the problem is solved. If not, one then uses the optimal linear programming basis to derive a relaxation of the integer program known as the group (knapsack) problem. Mathematically, the group problem may be assumed to take the following form:

(1)  min \sum_{t=1}^{q} c_t x_t
     s.t. \sum_{t=1}^{q} g_t x_t \equiv d (mod e),
     x_t a non-negative integer for all t,

where g_1, ..., g_q, d, and e are known integer r-vectors and the c_t's are known non-negative scalars. The details for obtaining the group problem in the above form may be found in [3, 7, 9]. The integer vectors g_t may be used to generate an abelian group under addition modulo e, and hence the names group theoretic approach and group problem.

The next step in the group theoretic approach is to solve (1). Gomory [4] presents a simple dynamic programming algorithm for solving this problem. Dynamic programming (see [1, 8]) is a multi-stage solution procedure in which a recursive relation is used to compute columns in a decision table. Each successive column represents a further stage in the optimization procedure. A group problem with q columns will require a q-stage optimization. At the conclusion of the q-th stage, the solution may be recovered by backtracking through the decision table.
In this paper, we present an unusual property of this particular decision table which allows one to recover the optimal solution from the information associated with only the q-th stage. Consequently, we can dispense with table storage as such for the dynamic programming technique by overlaying all table values at successive stages.

II. SOLVING GROUP PROBLEMS VIA DYNAMIC PROGRAMMING

Consider the following two-parameter class of group minimization problems over the group G = {g_0, g_1, ..., g_{m-1}}, where g_0 = 0, the zero vector:

(P_{ik})  min \sum_{t=1}^{k} c_t x_t
          s.t. \sum_{t=1}^{k} g_t x_t \equiv g_i (mod e),
          x_t a non-negative integer for all t,

where the integers i and k are bounded by 0 \le i \le m - 1 and 1 \le k \le q. Let f_{ik} denote the optimal objective value of P_{ik} if a solution exists, and let f_{ik} = \infty otherwise; f_{00} is taken to be 0, while f_{i0} = \infty for all i \ge 1. At the k-th stage of the dynamic programming solution procedure, one must find f_{ik} for i = 0, 1, ..., m - 1; that is, (1) is solved using only x_1, ..., x_k and with all group elements as right-hand sides of the congruence. If i^* is such that g_{i^*} = d, then the solution of P_{i^*q} is also the solution of (1).

A dynamic programming algorithm can be developed for P_{ik} by noting that the solutions to P_{ik} can be partitioned into those with x_k = 0 and those with x_k \ge 1. Let i' be defined such that g_{i'} = g_i - g_k. Then the above is equivalent to saying that for x_k = 0, f_{ik} = f_{i,k-1}, and for x_k \ge 1, f_{ik} = c_k + f_{i'k}. Thus a recursive equation may be written as follows:

(2)  f_{ik} = min {f_{i,k-1}, c_k + f_{i'k}}.

To apply (2), one must be able to compute f_{i'k} for all i. This is always possible if g_k generates the whole group. In the case where g_k does not generate G, the following procedure is used. For each coset, one chooses an element g_{i^*} and sets f_{i^*k}^* = f_{i^*,k-1}. Then one generates elements of the coset successively, using g_i = g_{i^*} + \alpha g_k for \alpha = 1, 2, ..., and computes

(3)  f_{ik}^* = min {f_{i,k-1}, c_k + f_{i'k}^*},

with i' determined by g_{i'} = g_i - g_k. The above procedure is cyclic and should be terminated when all new f_{ik}^* agree with those computed in the previous cycle. Then f_{ik} is taken to be f_{ik}^*. Termination is guaranteed within two cycles, and occurs anytime during the second cycle when any f_{ik}^* agrees with its previous value. Obviously this procedure may also be applied when g_k generates the whole group, since the coset of interest is then G. Justification for the above procedure may be found in [3, 7, 9, 10].

In order to recover the solution at the termination of stage q, one also carries a history function, which keeps track of the variable used in (3) to obtain the minimum. Note that in case f_{i,k-1} = c_k + f_{i'k}, an arbitrary decision is possible. This gives rise to a number of possible realizations of the history function, corresponding to the combinations of alternate optima for each problem P_{ik}. To facilitate the intended storage reduction, we define a particular history function as follows:

(4)  h_{ik} = h_{i,k-1} if f_{ik} = f_{i,k-1}, and k otherwise.

Note that this records the most recent stage at which a strict decrease occurred in the objective value associated with g_i. Thus, in case f_{i,k-1} = c_k + f_{i'k}, one does not use the variable associated with the new stage.

A naive implementation of the above algorithm requires a decision table with mq(t + p) bits, where t is the number of bits required to carry the f_{ik}'s and p must be such that 2^p \ge q + 1 (i.e., with p bits we represent the numbers 0 to q). This algorithm as described in [7, 9, 10] requires the full table size, while the presentation in [3] assumes a table with mq(t + 1) bits. However, all values of f_{ik} and h_{ik} need not be saved, since the recursions (2) and (3) can be executed with partial information about the current stage and partial information about the previous stage. Therefore, one may easily implement the algorithm with a table of size m(t + q) bits.
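For a cyclic group (r = 1, so G is Z_m), the coset update driving recursions (2)-(3) can be sketched as follows (an added illustrative sketch; the variable names are assumed, and the code simply cycles around each coset until no value strictly decreases, rather than counting the two-cycle bound explicitly). The history function of (4) is updated only on a strict decrease.

```python
import math

def stage_update(f, h, g_k, c_k, k, m):
    """One stage of recursions (2)-(3) on the cyclic group Z_m, updating the
    objective values f and the history function h of (4) in place."""
    visited = [False] * m
    for start in range(m):
        if visited[start]:
            continue
        # enumerate the coset {start + alpha*g_k (mod m) : alpha = 0, 1, ...}
        coset, i = [], start
        while not coset or i != start:
            coset.append(i)
            visited[i] = True
            i = (i + g_k) % m
        # cycle around the coset until no f value strictly decreases
        changed = True
        while changed:
            changed = False
            for i in coset:
                cand = c_k + f[(i - g_k) % m]
                if cand < f[i]:    # strict decrease: record stage k, as in (4)
                    f[i], h[i], changed = cand, k, True

# tiny assumed example on Z_7: generators 2 and 3, costs 5 and 3
f = [0.0] + [math.inf] * 6
h = [0] * 7
stage_update(f, h, 2, 5.0, 1, 7)   # stage 1
stage_update(f, h, 3, 3.0, 2, 7)   # stage 2
```

After the two stages, f holds the optimal values f_{i2} for every right-hand side g_i in Z_7, and h holds the stage of the most recent strict decrease for each element.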
Even so, the dynamic programming decision table may become quite large. We remark that with a table of size m{t + q) bits it is possible to recover all alternate optima. Implementation of the procedure may be enhanced if the group elements, G = [gQ, g\, .... ^„-i}, are ordered. Since < ^, < c for all /, there is a natural ordering of the group elements. For any element of G, say /3 = [/3i, . . . , /3j, we assign the order of /3, denoted /(/3), as follows: 1 = 2 k = i-\ n ^. k = \ This corresponds to array subscripting, using r subscripts with the first varying most rapidly. Using the above ordering, the recovery procedure is quite simple and is given below: 1. [Initialize Variables] x, ■^ 0, / = 1, . . . , q 2. [Start at d\ g *- d 3. [Start at stage q] k '— q 4. [Reference history] k *— h^^) ,^ 5. [Add to solution] x^^ *— x^ + 1 6. [Backtrack] g'— g — Qi, 7. [Done?] If ^ ?^ 0, go to 4; otherwise, terminate. Step 4 above implies that the history function for each group element, G = [gQ, ... , g,„-\}, and each stage, k = I, .... q, must be available for solution recovery. 684 R.V. HELGASON & J.L. KENNINGTON III. DYNAMIC PROGRAMMING ALGORITHM USING THE STORAGE REDUCTION TECHNIQUE In this section it is shown that h^g) j^ may be replaced by /j/(g),, in step 4 of the recovery procedure. Hence, only the q^^ stage history function need be available for solution recovery. Consider the following propositions. PROPOSITION 1: If hn, = j ^ k, then /?y = hij+^ = . . . = h,,, = j. The proof of proposi- tion 1 is obvious by the definition of the history function given in (4). PROPOSITION 2: For all integers / and k with < / < m - 1 and 1 < A: < 9, if j = h,^, then hf,, ^ 7 where gr = gi - Qj. PROOF: Choose / and k arbitrarily and let j = hi^. Let g^ be such that gr = gi — g,, and let j' = hf^. We must show that j' ^ J. Assume the contrary. Then by Proposition 1 and the definition of the history function, (5) (6) (7) (8) (9) (8) implies that frj > fry. 
Since h_{i,j} = j (by (5)), we have f_{i,j} = f_{r,j} + c_j. Let {x_1 = a_1, ..., x_j = a_j, ..., x_{j'} = a_{j'}} denote an optimum for f_{r,j'}. Then {x_1 = a_1, ..., x_j = a_j + 1, ..., x_{j'} = a_{j'}} is feasible for P_{i,j'}, since g_i = g_r + g_j, and has value f_{r,j'} + c_j. Since the objective value for an optimum of f_{i,j'} must be less than or equal to the objective value for any feasible solution, and since (6) gives f_{i,j'} = f_{i,j} = f_{r,j} + c_j, it follows that f_{r,j} + c_j ≤ f_{r,j'} + c_j. This implies f_{r,j} ≤ f_{r,j'}, which contradicts (9). Therefore j' ≤ j and the proposition is proved.

We now use the above propositions to prove that h_{l(g),q} can replace h_{l(g),k} in step 4 of the recovery algorithm.

PROPOSITION 3: For all integers i with 0 ≤ i ≤ m − 1, if j = h_{i,q}, then h_{r,q} = h_{r,j} where g_r = g_i − g_j.

PROOF: Choose i arbitrarily and let j = h_{i,q}. Let g_r be such that g_r = g_i − g_j. By Proposition 2, h_{r,q} ≤ j. Then from Proposition 1, h_{r,q} = h_{r,j}.

Since only the q-th stage history function is required for solution recovery, we drop the subscript associated with the stage for both the f_{i,k}'s and h_{i,k}'s. The complete dynamic programming algorithm may then be stated as follows:

REDUCED STORAGE D.P. ALGORITHM FOR THE GROUP PROBLEM

1. Initialize
   a. [Objective values] f_0 ← 0; f_i ← ∞, i = 1, ..., m − 1.
   b. [History values] h_i ← q + 1, i = 0, 1, ..., m − 1.
   c. [Stage counter] k ← 0.
2. Begin New Stage
   a. [Increment stage counter] k ← k + 1.
   b. [Flag group elements not updated] h_i ← −h_i, i = 0, 1, ..., m − 1.
   c. [Select first coset element] g ← d + g_k.
3. Find a Coset Element Which May Lead to an Improved Solution
   a. [Save starting index] i* ← l(g).
   b. [Test objective value] If f_{l(g)} < f_{l(d)}, go to 4.
   c. [Generate another coset element] g ← g + g_k.
   d. [Coset exhausted?] If l(g) ≠ i*, go to b.
   e. [Flag coset elements updated] h_{l(g)} ← |h_{l(g)}| for all g in the coset containing g_{i*}; go to 5.
4.
Apply Recursion to Coset
   a. [Starting value is previous value] v ← f_{l(g)}.
   b. [Possible next value using stage k generator] v ← v + c_k.
   c. [Flag element updated] h_{l(g)} ← |h_{l(g)}|.
   d. [Generate another coset element] g ← g + g_k.
   e. [Test for minimum] If f_{l(g)} ≤ v, go to h.
   f. [Decrease in value] f_{l(g)} ← v.
   g. [Update history] h_{l(g)} ← k; go to b.
   h. [Element previously updated?] If h_{l(g)} > 0, go to 5; otherwise, go to a.
5. Test to Terminate, Go to Next Stage, or Update Another Coset
   a. [Last stage?] If k = q, recover solution and terminate.
   b. [Test for another coset to update] Let g be an element for which h_{l(g)} < 0. If none, go to 2; otherwise, go to 3.

The above algorithm is essentially equivalent to the original procedure presented by Gomory [4], except that no arbitrary decisions are possible using (3) and all information is overlaid in the decision table. Hence, our approach may save considerable core storage with no loss in efficiency. Furthermore, additional computational efficiencies have been incorporated as follows:

(i) At any stage, if no element of a particular coset has objective value less than f_{l(d)}, the entire coset is not actually updated, since any problem solution so derived cannot be part of an optimal solution to P_{l(d),k}.

(ii) At each stage, the coset containing d is updated first. Thus at stage q we can terminate without updating any other cosets, and during earlier stages a better value of f_{l(d)} is used in the above test.

(iii) By using one additional bit per group element we flag cosets. Hence, no coset is ever considered more than once. In the algorithm presented, the flag bit is the sign of h_i.

This new procedure is implemented with m(t + p + 1) bits, as compared to m(t + q) bits for an efficient implementation not using the overlay feature, where t denotes the number of bits required to store an f_i and p is selected such that 2^p ≥ q + 2.
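Steps 1-5 above, together with the recovery loop justified by Proposition 3, can be sketched as follows. To stay self-contained the sketch uses an illustrative cyclic group Z_m, so l(g) is the element itself; the function names and the tiny test instance are our assumptions, not the paper's.

```python
# Reduced-storage group DP (steps 1-5) for a cyclic group Z_m, with the sign
# of h[i] serving as the "not yet updated this stage" flag, as in the paper.

def reduced_storage_dp(m, d, generators, costs):
    INF = float("inf")
    q = len(generators)
    f = [0.0] + [INF] * (m - 1)                 # 1a
    h = [q + 1] * m                             # 1b
    for k in range(1, q + 1):                   # 1c / 2a
        gk, ck = generators[k - 1] % m, costs[k - 1]
        h = [-abs(x) for x in h]                # 2b: negative = not updated
        g = (d + gk) % m                        # 2c: d's coset first
        while True:
            istar, promising = g, False         # 3a
            while True:
                if f[g] < f[d]:                 # 3b
                    promising = True
                    break
                g = (g + gk) % m                # 3c
                if g == istar:                  # 3d: coset exhausted
                    break
            if promising:                       # step 4
                while True:
                    v = f[g] + ck               # 4a + 4b
                    h[g] = abs(h[g])            # 4c
                    g = (g + gk) % m            # 4d
                    while f[g] > v:             # 4e fails: do 4f, 4g
                        f[g] = v
                        h[g] = k                # k > 0, so already "updated"
                        v = f[g] + ck           # back to 4b
                        g = (g + gk) % m        # 4d
                    if h[g] > 0:                # 4h: full circle in the coset
                        break
            else:
                while True:                     # 3e: flag the whole coset
                    h[g] = abs(h[g])
                    g = (g + gk) % m
                    if g == istar:
                        break
            if k == q:                          # 5a: d's coset done at stage q
                return f, [abs(x) for x in h]
            if not any(x < 0 for x in h):       # 5b: none pending, next stage
                break
            g = next(i for i in range(m) if h[i] < 0)
    return f, [abs(x) for x in h]

def recover(h, d, generators, m, q):
    """Recovery steps 1-7 using only the stage-q history (Proposition 3).
    Assumes d is reachable, i.e. f_{l(d)} is finite."""
    x = [0] * (q + 1)
    g = d
    while g != 0:
        k = h[g]                                # reference history
        x[k] += 1                               # add variable k to the solution
        g = (g - generators[k - 1]) % m         # backtrack
    return x[1:]
```

On Z_4 with d = 3, generators (2, 1) and costs (1, 5), the optimum is g_1 + g_2 at cost 6, and the recovery loop reads that solution off the final history values alone.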
Table 1 presents a comparison of storage requirements on typical group problems taken from [6], with t taken as representative of the word size in bits for two classes of machines. The storage savings range from 12 to 85%, with an average of approximately 50%.

TABLE 1. Comparison of Table Size for Dynamic Programming Algorithm
(Standard table = m(32 + q) or m(60 + q) bits; reduced table = m(32 + p + 1) or m(60 + p + 1) bits; table and problem numbers refer to [6].)

Table | Problem | Basis Det. m | Nonbasics q | p (2^p ≥ q+2) | IBM 360/370 (32-bit) Std | Reduced | % Sav | CDC 6000/7000 (60-bit) Std | Reduced | % Sav
  3a  |    5    |        24    |     240     |       8       |       6528 |     984 | 85 |   7200 |    1656 | 77
  3a  |   10    |       144    |      36     |       6       |       9792 |    5616 | 43 |  13824 |    9648 | 30
  3a  |   15    |       180    |     240     |       8       |      48960 |    7380 | 85 |  54000 |   12420 | 77
  3a  |   20    |       280    |     109     |       7       |      39480 |   11200 | 72 |  47320 |   19040 | 60
  3a  |   25    |       512    |     104     |       7       |      69632 |   20480 | 71 |  83968 |   34816 | 59
  3a  |   30    |      1080    |     109     |       7       |     152280 |   43200 | 72 | 182520 |   73440 | 60
  3a  |   35    |      2048    |     104     |       7       |     278528 |   81920 | 71 | 335872 |  139264 | 59
  3b  |    2    |        48    |     195     |       8       |      10896 |    1968 | 82 |  12240 |    3312 | 73
  3b  |    4    |       128    |      14     |       4       |       5888 |    4736 | 20 |   9472 |    8320 | 12
  3b  |    6    |       864    |      36     |       6       |      58752 |   33696 | 43 |  82944 |   57888 | 30
  3b  |    8    |      5025    |      18     |       5       |     251250 |  190950 | 24 | 391950 |  331650 | 15
  3b  |   10    |      6912    |      36     |       6       |     470016 |  269568 | 43 | 663552 |  463104 | 30

IV. SUMMARY

We have shown that by using a history function which records only the most recent stage at which a strict decrease occurred in the objective value associated with each group element, it is possible to dispense with table storage as such and to overlay values for both the objective and history functions. We have shown that this storage reduction may be accomplished with no loss in computational efficiency and have incorporated the technique into a highly efficient algorithm. Our procedure does not allow for the recovery of alternate optima. However, by dynamically storing a partial table consisting of all occurrences of ties in (3) following a strict decrease of f_{i,k} (as given by h_{i,k}), alternate optima may be determined.

REFERENCES

[1] Bellman, R.E. and S.E.
Dreyfus, Applied Dynamic Programming (Princeton University Press, Princeton, New Jersey, 1962).
[2] Fisher, M.L., W.D. Northup and J.F. Shapiro, "Using Duality to Solve Discrete Optimization Problems: Theory and Computational Experience," Mathematical Programming Study 3, 56-94 (1975).
[3] Garfinkel, R.S. and G.L. Nemhauser, Integer Programming (John Wiley and Sons, New York, New York, 1972).
[4] Gomory, R.E., "Some Polyhedra Related to Combinatorial Problems," Linear Algebra and Its Applications 2, 451-558 (1969).
[5] Gorry, G.A. and J.F. Shapiro, "An Adaptive Group Theoretic Algorithm for Integer Programming," Management Science 17(5), 285-306 (1971).
[6] Gorry, G.A., W.D. Northup and J.F. Shapiro, "Computational Experience with a Group Theoretic Integer Programming Algorithm," Mathematical Programming 4, 171-192 (1973).
[7] Hu, T.C., Integer Programming and Network Flows (Addison-Wesley, Reading, Massachusetts, 1969).
[8] Nemhauser, G.L., Introduction to Dynamic Programming (John Wiley and Sons, New York, New York, 1966).
[9] Salkin, H.M., Integer Programming (Addison-Wesley Publishing Company, Reading, Massachusetts, 1975).
[10] Taha, H.A., Integer Programming: Theory, Applications, and Computations (Academic Press, New York, New York, 1975).
[11] Zionts, S., Linear and Integer Programming (Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1974).

EXPERIMENTS WITH LINEAR FRACTIONAL PROBLEMS

Gabriel R. Bitran
Massachusetts Institute of Technology
Cambridge, Massachusetts

ABSTRACT

In this paper we present the results of a limited number of experiments with linear fractional problems. Six solution procedures were tested, and the results are expressed in the number of simplex-like pivots required to solve a sample of twenty randomly generated problems.
Two main approaches emerge from the literature to solve the linear fractional problem

(P)  v = max{ f(x) = n(x)/d(x) : x ∈ F },

where n(x) = c_0 + cx, d(x) = d_0 + dx, F = {x ∈ R^n : Ax = b, x ≥ 0}, c_0 and d_0 are real numbers, c and d are real n-vectors, A is an m×n real matrix, and b is a real m-vector. We assume in this note that F is compact and that min{d(x) : x ∈ F} > 0.

Charnes and Cooper [4] transform problem (P) into the linear program

v = max{ c_0 t + cy : Ay − bt = 0, d_0 t + dy = 1, t, y ≥ 0 }.

This approach has been extended to nonlinear versions of (P) by Bradley and Frey [3] and Schaible [8]. The second approach solves a sequence of linear problems, or at least one pivot step of each linear program, over the original feasible set by updating the objective function. Algorithms in this category are related to ideas first presented by Isbell and Marlow [5] and Martos [6]. Similar algorithms have been proposed by several other authors; the interested reader is referred to the excellent bibliography collected by I.M. Stancu-Minasian [9]. Methods in the second approach solve (P) through a sequence of linear programs

(LP_k)  r(x^k) = max{ r(x^k, x) = n(x) − f(x^k) d(x) : x ∈ F },  k = 0, 1, 2, ...,

where x^0 is a given feasible point and x^k for k ≥ 1 is defined in Isbell and Marlow's procedure as the optimal solution to (LP_{k−1}), and in Martos's procedure as the first feasible basis in (LP_{k−1}) for which r(x^{k−1}, x) > 0. Both algorithms terminate at the iteration k_0 for which r(x^{k_0}) = 0. In this case x^{k_0} = x_optimal.

It is worth noting that Wagner and Yuan [10] related the two main approaches by showing that Martos's algorithm is equivalent to Charnes and Cooper's method in the sense that both algorithms lead to an identical sequence of pivoting operations. Bitran and Magnanti [1] have extended the connection between these approaches by relating them to generalized programming.
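The (LP_k) iteration can be sketched in a few lines. A real implementation performs simplex pivots over F; to stay self-contained, the sketch below "solves" each (LP_k) by scanning a supplied list of vertices of F, which suffices for a toy instance (the function name and the instance are our assumptions, not the paper's).

```python
# Isbell-Marlow style iteration on (LP_k): maximize n(x) - f(x^k) d(x),
# stopping when r(x^k) = 0, at which point x^k is optimal for (P).

def isbell_marlow(vertices, n, d, x0, tol=1e-9):
    """Iterate x^{k+1} = argmax_x n(x) - f(x^k) d(x) over the vertex list."""
    x = x0
    while True:
        fk = n(x) / d(x)                          # f(x^k)
        best = max(vertices, key=lambda v: n(v) - fk * d(v))
        if n(best) - fk * d(best) <= tol:         # r(x^k) = 0: terminate
            return x, fk
        x = best
```

For example, maximizing (1 + 2x)/(2 + x) over F = [0, 1] (vertices 0 and 1) from x^0 = 0 terminates in one step at x = 1 with value 1.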
No theoretical or empirical evidence has been given in the past indicating which of the several existing algorithms is to be preferred.

In this note we present the results, in number of simplex-like pivots, of twenty randomly generated problems of type (P) solved by the following six algorithms (each problem, when solved by each of the six procedures, was started from the same basic feasible solution):

A) Maximize n(x) over the feasible set, obtaining the optimal solution x*. Next, apply Isbell and Marlow's algorithm with x^0 = x*.

B) Minimize d(x) over the feasible set, obtaining the optimal solution x*. Next, apply Isbell and Marlow's algorithm with x^0 = x*.

C) Maximize g(x) = [c − (cd/dd)d]x over F, obtaining the optimal solution x*. Next, apply Isbell and Marlow's algorithm with x^0 = x* (Bitran and Novaes [2] suggested the objective function g(x)).

D) Isbell and Marlow's algorithm.

E) Martos's algorithm.

F) The author considered it relevant to compare these algorithms with the number of pivots necessary to solve the linear programs

(LP)  max{ n(x) − v d(x) : x ∈ F },

where for each of the twenty problems (P), v is chosen as its optimal value. The optimal value of (LP) is zero and any solution to (LP) is optimal in the fractional program (P) ([1]). Note that (LP) corresponds to (LP_k) with x^k = x_optimal.

The characteristics of the data of the twenty randomly generated problems are the following: n = 40, m = 20; the absolute value of each a_ij, the (i,j)-th element of the matrix A, was randomly generated in the interval (0, 10], with the density of negative elements being 20%. Each component b_i, i = 1, 2, ..., m, of the right-hand side b was defined as Σ_{j=1}^{n} a_ij / 2. The objective function coefficients c_0, c_j, d_0, d_j, j = 1, 2, ..., n, were generated in the intervals [−1000 ≤ c_0, c_j ≤ 1000; 0 < d_0, d_j ≤ 1], [1 ≤ c_0, c_j ≤ 1000; 1 ≤ d_0, d_j ≤ 2], or [−1000 ≤ c_0, c_j ≤ −1; 1 ≤ d_0, d_j ≤ 2].
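The generation scheme just described can be sketched as follows (first coefficient interval only; the function name and seeding are our assumptions, and `uniform(0, 10)` returning exactly 0.0 is ignored as negligible for a sketch of the interval (0, 10]).

```python
# Random instance generator following the stated scheme: n = 40, m = 20,
# |a_ij| uniform in (0, 10] with 20% negative density, b_i = sum_j a_ij / 2.

import random

def generate_problem(n=40, m=20, seed=0):
    rng = random.Random(seed)                   # seeded for reproducibility
    A = [[rng.uniform(0.0, 10.0) * (-1.0 if rng.random() < 0.2 else 1.0)
          for _ in range(n)] for _ in range(m)]
    b = [sum(row) / 2.0 for row in A]           # right-hand side rule
    c0 = rng.uniform(-1000.0, 1000.0)
    c = [rng.uniform(-1000.0, 1000.0) for _ in range(n)]
    d0 = rng.uniform(0.0, 1.0)
    d = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return A, b, c0, c, d0, d
```

The b_i = Σ_j a_ij / 2 rule guarantees that x = (1/2, ..., 1/2) satisfies Ax = b, so every generated instance is feasible.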
The reason for choosing such intervals was to obtain five problems with an angle θ between the gradients of the numerator and denominator, i.e.,

cos θ = cd / (‖c‖ ‖d‖),

in each of the four intervals [0, π/4], [π/4, π/2], [π/2, 3π/4], and [3π/4, π], in an attempt to identify a correlation between the algorithms tested and the geometry of linear fractional programs. The geometric properties of problem (P) are consequences of the following facts.

i) The hyperplanes n(x) − L d(x) = 0 contain, for each L, both the sets {x ∈ R^n : f(x) = L} and CE = {x ∈ R^n : n(x) = 0 and d(x) = 0}. The set CE is called the center of the problem because, as L varies, the hyperplanes rotate about it, giving a "star" centered at CE ([2]).

ii) The objective function f(x) is pseudo-concave and quasi-convex on the set {x ∈ R^n : d(x) > 0}, i.e., f(y) > f(x) if and only if ∇f(x)(y − x) > 0.

In R², the geometry of (P) ([2]) suggests that procedure (C) would perform better than (A) and (B) for high and low values of θ (θ ∈ [0, π]). Table 1 shows the results obtained. For the first and last five problems a total of 178 pivots was necessary with procedure (C), while 233 and 363 pivots were required with procedures (A) and (B) respectively, the corresponding standard deviations being 3.70, 6.01, and 7.90. For the twenty problems selected, Martos's algorithm performed better than the preceding four and in some cases required fewer pivots than procedure (F). Algorithms (C) and (D) were practically equivalent and were followed by (A), while (B) performed poorly. The computer code used to solve the twenty problems by the six algorithms was an adaptation of Burroughs's commercial code TEMPO.
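The grouping statistic above is just the cosine of the angle between the two gradient vectors; a minimal sketch (function name ours):

```python
# cos(theta) = c.d / (||c|| ||d||), the angle between the numerator
# gradient c and the denominator gradient d.

from math import sqrt

def cos_theta(c, d):
    dot = sum(ci * di for ci, di in zip(c, d))
    return dot / (sqrt(sum(ci * ci for ci in c)) *
                  sqrt(sum(di * di for di in d)))
```

Values near +1 place a problem in the first interval [0, π/4]; values near −1 place it in [3π/4, π].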
TABLE 1. Number of Simplex-Like Pivots by Algorithm

Problem   |   A  |   B   |   C  |   D  |   E  |   F  | cos θ
    1     |  24  |  21   |  13  |  11  |  12  |  12  |  .873
    2     |  21  |  34   |  18  |  18  |  15  |  15  |  .858
    3     |  23  |  34   |  12  |  12  |  10  |  10  |  .819
    4     |  19  |  39   |  21  |  21  |  18  |  17  |  .770
    5     |  32  |  46   |  22  |  23  |  21  |  19  |  .730
Mean      | 23.8 | 34.8  | 17.2 | 17.0 | 15.2 | 14.6 |
Std. Dev. | 4.44 | 8.18  | 4.07 | 4.77 | 3.97 | 3.26 |
    6     |  22  |  32   |  20  |  21  |  15  |  15  |  .569
    7     |  25  |  57   |  28  |  23  |  18  |  18  |  .500
    8     |  19  |  16   |  16  |  16  |  15  |  15  |  .370
    9     |  22  |  51   |  18  |  20  |  11  |  11  |  .132
   10     |  19  |  39   |  18  |  26  |  20  |  16  |  .076
Mean      | 21.4 | 39.0  | 20.0 | 21.2 | 15.8 | 15.0 |
Std. Dev. | 2.24 | 14.46 | 4.19 | 3.31 | 3.06 | 2.28 |
   11     |  12  |  38   |  18  |  18  |  11  |  10  | −.103
   12     |  21  |  47   |  21  |  21  |  19  |  20  | −.289
   13     |  18  |  33   |  20  |  22  |  20  |  21  | −.424
   14     |  18  |  36   |  20  |  20  |  19  |  22  | −.485
   15     |  33  |  50   |  31  |  25  |  26  |  19  | −.613
Mean      | 20.4 | 40.8  | 22.0 | 21.2 | 19.0 | 18.4 |
Std. Dev. | 6.94 | 6.55  | 4.60 | 2.31 | 4.77 | 4.32 |
   16     |  19  |  51   |  17  |  17  |  15  |  15  | −.720
   17     |  16  |  30   |  13  |  12  |  13  |  16  | −.747
   18     |  16  |  39   |  20  |  21  |  13  |  15  | −.820
   19     |  33  |  33   |  22  |  23  |  21  |  24  | −.840
   20     |  30  |  36   |  20  |  24  |  18  |  19  | −.874
Mean      | 22.8 | 37.8  | 18.4 | 19.4 | 16.0 | 17.8 |
Std. Dev. | 7.25 | 7.25  | 3.14 | 4.41 | 3.10 | 3.43 |
Total     |  442 |  762  |  388 |  394 |  330 |  329 |
Mean      | 22.1 | 38.1  | 19.4 | 19.7 | 16.5 | 16.4 |
Std. Dev. | 5.75 | 9.88  | 4.42 | 4.20 | 4.07 | 3.79 |

TABLE 2. Null Hypotheses Tested (pairwise two-sided Wilcoxon signed rank tests: W, σ, and α for each pair of algorithms)

TABLE 3. Null Hypotheses Tested (one-sided Wilcoxon tests comparing the means for pairs of algorithms)

To test whether the observed differences in the number of iterations between algorithms are statistically significant, we performed Wilcoxon's signed rank test [7]. The test was used to compare the algorithms pairwise.
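The group statistics in Table 1 can be reproduced directly; note that the table uses the population form of the standard deviation (divide by n, not n − 1). The function name and the choice of the algorithm-A column are ours.

```python
# Mean and population standard deviation, as used for the Table 1 group rows.

from math import sqrt

def mean_and_std(xs):
    m = sum(xs) / len(xs)
    return m, sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

pivots_A_first_five = [24, 21, 23, 19, 32]    # algorithm A, problems 1-5
mean_A, std_A = mean_and_std(pivots_A_first_five)
```

This reproduces the first group row for algorithm A: a mean of 23.8 and a standard deviation of about 4.44 (agreeing with the printed value to rounding).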
The null hypothesis is that the distributions of the number of iterations required by the pair of algorithms being tested are identical. Table 2 shows the results obtained. The first row in the table indicates the algorithms being compared. W is the Wilcoxon statistic, σ is its standard deviation, and α is the smallest level of significance for which the null hypothesis is rejected in a two-sided symmetrical Wilcoxon test (α represents the sum of the two tails in the test). As an example, when comparing algorithms C and E, the null hypothesis is rejected for any significance level greater than .2%. The values "<.2" in the last row of the table indicate that the value of α for the corresponding tests is smaller than .2%. The results in Table 2 suggest that the distributions of the number of iterations required by algorithms C and D, and by E and F, are not significantly different.

A chi-square test of the null hypothesis that the distribution of the number of iterations for each algorithm can be approximated by a normal distribution showed that the null hypothesis cannot be rejected at a confidence level of .995. Under the assumption that the distributions of the number of iterations required by two algorithms X and Y are normal, Wilcoxon's test can be used to compare the means μ_X and μ_Y. The results of these tests are given in Table 3. W and σ are respectively the Wilcoxon statistic and its standard deviation; α, in the last row of the table, is the smallest level of significance for which the null hypothesis is rejected in a one-sided test. The null hypotheses in all tests where algorithms E and F are compared with A, B, C, and D are rejected at very low levels of significance.

REFERENCES

[1] Bitran, G.R. and T.L. Magnanti, "Duality and Sensitivity Analysis for Fractional Programs," Operations Research 24, 675-699 (1976).
[2] Bitran, G.R. and A.G.
Novaes, "Linear Programming with a Fractional Objective Function," Operations Research 21, 22-29 (1973).
[3] Bradley, S.P. and S.C. Frey, Jr., "Fractional Programming with Homogeneous Functions," Operations Research 22, 350-357 (1974).
[4] Charnes, A. and W.W. Cooper, "Programming with Linear Fractional Functionals," Naval Research Logistics Quarterly 9, 181-186 (1962).
[5] Isbell, J.R. and W.H. Marlow, "Attrition Games," Naval Research Logistics Quarterly 3, 71-93 (1956).
[6] Martos, B., "Hyperbolic Programming," Naval Research Logistics Quarterly 11, 135-155 (1964).
[7] Mosteller, F. and R.E.K. Rourke, Sturdy Statistics: Nonparametric and Order Statistics (Addison-Wesley Publishing Company, Inc., Reading, MA, 1973).
[8] Schaible, S., "Parameter-Free Convex Equivalent and Dual Programs of Fractional Programming Problems," Zeitschrift für Operations Research 18, 187-196 (1974).
[9] Stancu-Minasian, I.M., "Bibliography of Fractional Programming 1960-1976," Preprint No. 3, February 1977, Academy of Economic Studies, Department of Economic Cybernetics, Bucharest, Romania.
[10] Wagner, H.M. and J.S.C. Yuan, "Algorithm Equivalence in Linear Fractional Programming," Management Science 14, 301-306 (1968).

THE SENSITIVITY OF FIRST TERM NAVY REENLISTMENT TO CHANGES IN UNEMPLOYMENT AND RELATIVE WAGES*

Les Cohen
Government Services Division
Kenneth Leventhal & Company
Washington, D.C.

Diane Erickson Reedy
Mathtech, Inc., A Division of Mathematica, Inc.
Rosslyn, Virginia

ABSTRACT

Multiple regression analysis was used to analyze newly developed twenty-year time series of first term reenlistment rates for nine major Navy occupational categories. Results indicate that there are significant differences among the occupational categories in the determinants of their reenlistment behavior. More importantly, it is apparent that reenlistment rates are highly sensitive to current unemployment, and especially unemployment about the time of enlistment.
By comparison, relative wages (measures of military versus private sector rates of compensation) are relatively insignificant and appear powerless to control reenlistment in the context of normal fluctuations in economic activity.

I. INTRODUCTION

This paper reports the results of an analysis of first term reenlistment over the past twenty years. The study's principal objectives were to determine the uniqueness of reenlistment behavior in the Navy's major enlisted occupational categories and, in the process, to measure the sensitivity of reenlistment to economic fluctuations and to changes in military versus private sector rates of compensation.

In addition to the war in Viet Nam and a large number of major social, political, and technological developments, the past 20 years (1958-1977) have seen highly varied economic activity. Following a long period of recovery through the Kennedy-Johnson-Viet Nam era, there have been two dramatic, successive recessions since 1969. Over the past twenty years, unemployment rates ranged from 3.5 percent to almost 9 percent, averaging 5.5 percent with a standard deviation of 1.4 percent.

Against this background of economic fluctuations, first term reenlistment rates for each of the nine major Navy occupational categories which were studied demonstrated a bimodal or saddle-shaped pattern, with a mild rise during the early 1960's and considerably more dramatic increases in the early to mid-1970's. Reenlistment rates over the twenty years, on the average, ranged between 10 and 50 percent.*

*This research was supported by the Office of the Chief of Naval Operations, Systems Analysis Division, under contract N00074-78-C-0073 with Information Spectrum, Inc., Arlington, Virginia.

From individual monthly editions of NAVY MILITARY PERSONNEL STATISTICS, numbers of first term eligibles and reenlistments were recorded for each of the major occupational categories reported in a given month.
These categories ranged from a maximum of 28 to as few as 19. The need for consistency over the twenty-year sample dictated the collapsing of these groups into 17 occupational categories for which data were present throughout the time series. These were in turn combined into the nine occupational groups on which the study focused. (Apprentice categories were added to their journeyman counterparts, Precision Equipment was combined with Electronics, and Dental with Medical.) A reduction in the number of categories was effected to increase the size of the statistics reported for each group and to minimize spurious movements in the reenlistment rates which would obscure the meaning of experimental findings. To further improve the quality of the data, monthly observations were converted to quarterly, again for the purpose of increasing the number of eligibles to reasonable levels and to smooth the time series so as to render real trends more readily intelligible. The resultant data base for nine Navy enlisted occupations contained reenlistment rates for 80 calendar quarters covering the twenty years from 1958 through 1977.

II. METHODOLOGY

The methodology which the study employed was multiple regression via ordinary least squares. The same equation was estimated for each of the nine occupational categories. It was decided that differences among occupations would be deduced from comparisons of individual variable performances and, to a much lesser extent, from the R² statistics. No attempt was made to estimate the most effective equation for each occupation. Instead, variables and transformations were selected based on their frequency of significance and general impact across all occupations evaluated collectively. In all experiments, the dependent variable was the simple reenlistment rate, computed as in NAVY MILITARY PERSONNEL STATISTICS as the ratio of reenlistments to eligibles. Five types of independent variables were regressed against these reenlistment rates:

1.
Constants, simple and seasonal

2. War Variables, constants and casualty counts representing the immediate, current period impact of the Viet Nam War

3. Motivational Variables, describing the influence of the Draft and of economic conditions at the time of enlistment

4. Current Economic Conditions in the Private Sector, aggregate and for occupation and industry labor market strata

5. Relative Wages, military versus private sector rates of compensation.

Plots generated for selected occupational categories revealed six observations, concentrated in the early phases of the sample, lying considerably above the normal range of values. Upon further investigation, it was determined that these outliers were related to involuntary extensions and early separation programs, the effects of which differed noticeably among occupations and from period to period within a single occupation time series.* Rather than reducing the sample size, outliers were replaced with the average of the rates from the preceding and following quarters. While seasonal changes were not taken into account, this means of adjusting for outliers preserved the general trends of the time series.

*All reenlistment rates studied pertained strictly to the population of recruits remaining in service for their full terms, excluding all individuals who previously separated. No consideration was given to rates of attrition or their implications for the attributes of remaining eligibles.

The basic equation around which experimentation evolved was defined as follows:

RR = f(AUR, RW, DRAFT, WAR, S3)

where

AUR = current unemployment rate (seasonally adjusted)
RW = ratio of military to private sector wages
DRAFT = induction levels at the time of enlistment
WAR = dummy for the Viet Nam War
S3 = third calendar quarter seasonal dummy
III. TWENTY YEAR EQUATIONS

Table 1 contains definitions for all independent variables to which reference will be made throughout the remainder of the paper. Table 2 describes equations estimated for all 80 observations. The single equation used in Table 2 is the study's baseline equation from which all findings are derived.† The baseline equation has nine independent variables, plus a tenth (DEF), for "change in definition," for Electronics and for Engineering & Hull (E&H).‡

Referring to Table 2, the statistical significance of the third quarter seasonal dummy is evident. Generally, reenlistment rates are down by 3 to 5 points in the Summer and early Fall, an accounting phenomenon which results from individuals extending their terms of service for convenience into the Summer. Almost invariably, persons requesting these short-term extensions do not reenlist, thereby driving rates of reenlistment downward. The effect of these extensions is regular and of the same magnitude among the occupations.

DRFT18 represents draft levels in units of 100,000 lagged 18 periods prior to reenlistment, approximately six to nine months before enlistment.** This lag was determined experimentally, in light of uncertainty regarding the precise personal and legal points of commitment to the Navy. It is reasonable to assume that this six to nine month period is associated with deferred

*Early separation programs were instituted in 1958, 1960, and 1961. Involuntary extensions occurred in conjunction with Berlin (1961), Cuba (1962) and Viet Nam (1965).
†A review of the correlation matrix for all variables indicated no signs of multicollinearity.
‡As of the fourth calendar quarter of 1973, a change in BUPERS policy affected reenlistment rates in ratings with six year obligors. The effect of this procedural change fell primarily upon the Electronics and E&H categories, which have heavy concentrations of long-term training programs.
As a result, beginning in October 1973, Electronics and E&H first term reenlistment rates are artificially inflated. To compensate, the DEF dummy variable was added to the Electronics and E&H equations for the twenty year sample. As Table 2 indicates, DEF was significant for both occupations, indicating average overstatements of reenlistment rates of 27 and 10 percentage points for Electronics and E&H respectively over the period for which DEF was in effect.

**With minor exceptions, the preponderance of Navy recruits covered by the sample enlisted for four years. Data prohibited the isolation of persons entering under three or six year programs.

TABLE 1. Variable Definitions

Dependent Variable:
  Reenlistment Rate = Ratio of Reenlistees to Eligibles (e.g., .56 = 56% reenlistment)

Independent Variables:
  C       Constant
  AUR     Aggregate Seasonally Adjusted Unemployment Rate (e.g., .06 = 6% unemployment)
  ARAUR   Two Year Average Quarterly Rate of Change in AUR (e.g., −.07 = −7% average rate of decline in AUR)
  AUR13   Unemployment Rate 13 Periods Prior to Reenlistment (e.g., .06 = 6% unemployment)
  RW      Relative Wages (E-4 Base Pay to Private Sector Earnings) (e.g., .45 indicates E-4 pay is 45% of private sector wages)
  ARATE   Average Grade of Eligibles (e.g., 4.01 indicates average one point above E-4)
  WAR     Viet Nam War Dummy (1/1968 - 4/1972)
  DRFT18  Draft Levels 18 Periods Prior to Reenlistment (Inductees x 10^-5) (e.g., .96 = 96,000 inductees)
  S3      3rd Calendar Quarter Seasonal Dummy
  DEF     Dummy for Change in Definition (e.g., 6-year obligors; Electronics and E&H only)

TABLE 2.
Total, Twenty Year Sample Determinants of Navy Reenlistment (Quarterly: 1/58 - 4/77)
Coefficients (t-statistics) for all variables

Occupation         |    C   |   AUR  | ARAUR  | AUR13  |   RW   | ARATE  |  WAR   | DRFT18 |   S3   |  DEF   |  R² | D-W
Deck               |  +0.52 |  +3.42 |  -0.35 |  +2.00 |  +0.54 |  -0.19 |  +0.02 |  -0.01 |  -0.03 |        | .74 | 1.36
                   | (2.13) | (4.85) | (1.77) | (2.63) | (6.16) | (2.74) | (0.91) | (3.49) | (2.65) |        |     |
Ordnance           |  -0.36 |  +4.68 |  -0.66 |  +2.21 |  +0.69 |  +0.01 |  +0.08 |  -0.01 |  -0.05 |        | .75 | 1.19
                   | (1.05) | (4.73) | (2.41) | (2.11) | (5.67) | (0.15) | (2.50) | (3.69) | (2.96) |        |     |
Electronics &      |  -1.36 |  +2.24 |  +0.33 |  +2.24 |  +0.84 |  +0.27 |  +0.17 |  -0.01 |  -0.05 |  +0.27 | .88 | 1.01
  Prec. Equip.     | (2.76) | (1.48) | (0.39) | (1.38) | (3.33) | (1.96) | (3.48) | (1.89) | (2.05) | (4.77) |     |
Administration     |  +0.45 |  +3.37 |  -0.36 |  +0.92 |  +0.26 |  -0.12 |  -0.05 |  -0.00 |  -0.05 |        | .74 | 1.63
                   | (1.93) | (4.93) | (1.91) | (1.28) | (3.04) | (1.80) | (2.31) | (1.20) | (3.86) |        |     |
Seaman             |  +0.31 |  +1.84 |  -0.04 |  +0.62 |  -0.02 |  -0.07 |  -0.08 |  +0.00 |  -0.02 |        | .63 | 1.42
                   | (1.40) | (2.86) | (0.21) | (0.90) | (0.21) | (1.17) | (3.51) | (1.53) | (1.48) |        |     |
Engineering & Hull |  +0.09 |  +1.21 |  -0.02 |  +0.95 |  +0.28 |  -0.02 |  -0.03 |  -0.00 |  -0.04 |  +0.10 | .81 | 1.62
                   | (0.43) | (1.86) | (0.10) | (1.32) | (2.47) | (0.34) | (1.39) | (1.64) | (3.14) | (3.79) |     |
Construction       |  -0.83 |  -0.47 |  -0.03 |  +2.78 |  +0.18 |  +0.21 |  -0.20 |  +0.01 |  -0.03 |        | .69 | 1.12
                   | (2.64) | (0.52) | (0.10) | (2.88) | (1.61) | (2.31) | (6.46) | (3.59) | (1.76) |        |     |
Aviation           |  -0.00 |  +3.69 |  -0.17 |  +1.47 |  +0.28 |  -0.03 |  -0.04 |  -0.00 |  -0.03 |        | .73 | 1.16
                   | (0.00) | (5.06) | (0.84) | (1.91) | (3.08) | (0.49) | (1.62) | (1.54) | (2.61) |        |     |
Medical & Dental   |  +0.13 |  +4.05 |  -0.45 |  +0.65 |  -0.45 |  -0.30 |  -0.05 |  -0.01 |  -0.05 |        | .51 | 0.94
                   | (0.41) | (4.31) | (1.73) | (0.65) | (1.73) | (0.30) | (1.54) | (1.32) | (2.90) |        |     |

Significance: for 30 or more degrees of freedom — 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.

entrance and/or elapsed time between the decision and act of enlistment. DRFT18 was included to indicate the proportion of Navy eligibles who may have enlisted under pressure of the Draft.
Presumably, draft-motivated enlistees were less prone to reenlist than their counterparts who selected service in the Navy without otherwise being compelled by the threat of induction. Except in the case of Construction, DRFT18 coefficients were negative, though very small, and significant for only four occupations. Draft motivation among Navy enlistees does not appear to have been, or promise to be, of major importance for first term retention.

The Viet Nam War dummy (1968-1972, all inclusive) displays unexpected differences in sign among the five categories for which it is significant. Intuitively, a negative coefficient makes sense: war is dangerous, and military service in time of war is by definition a hazardous vocation. Alternatively, more challenging assignments, greater rates of advancement, and perhaps also a heightened sense of purpose may have caused the positive WAR coefficients for the Ordnance and Electronics groups. It is important to note that the WAR dummy variable was substantially more effective than an alternative which measured all-service casualty counts. While casualties were present as early as 1961, the figures increased dramatically in the late 1960's, roughly in concert with the anti-War movement. It may be the effect of that movement upon attitudes toward the military, rather than the war itself, which the WAR dummy variable is capturing.

Of central importance are the three unemployment variables in Table 2 which define national labor market activity at the time of enlistment and reenlistment. AUR is the national aggregate unemployment rate and is representative of the availability of private sector employment opportunities and of the difficulty and uncertainty associated with finding employment. As anticipated, AUR is significant, with large positive coefficients, for seven of the nine occupations. Post-recession recoveries typically entail 3 to 4 points reduction in unemployment over 2 to 3 years. Taken literally, the AUR coefficients suggest these recoveries may precipitate reductions of 15 to 25 points in reenlistment rates.
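Reading Table 2 is simple arithmetic: a fitted rate is the constant plus the coefficient-weighted regressors, and the unemployment sensitivity quoted above is just coefficient times change in AUR. The regressor values below are hypothetical illustrations of our own, not observations from the study's data base.

```python
# Evaluating one row of Table 2 (Deck) and the back-of-envelope AUR swing.

DECK = {"C": 0.52, "AUR": 3.42, "ARAUR": -0.35, "AUR13": 2.00, "RW": 0.54,
        "ARATE": -0.19, "WAR": 0.02, "DRFT18": -0.01, "S3": -0.03}

def fitted_rate(coef, values):
    """Predicted reenlistment rate: constant plus weighted regressors."""
    return coef["C"] + sum(coef[k] * v for k, v in values.items())

def reenlistment_swing(aur_coefficient, unemployment_change):
    """Predicted change in the rate for a given change in AUR (fractions)."""
    return aur_coefficient * unemployment_change

# Hypothetical quarter: 6% unemployment now, 5% thirteen quarters ago,
# relative wages .45, average grade E-4, peacetime, no draft, not Q3.
example = {"AUR": 0.06, "ARAUR": 0.0, "AUR13": 0.05, "RW": 0.45,
           "ARATE": 4.0, "WAR": 0, "DRFT18": 0.0, "S3": 0}
```

With these inputs the Deck row predicts a rate of about .31, and a 4-point fall in AUR against the larger coefficients (e.g., Ordnance's +4.68) yields a swing near 19 points, consistent with the 15-to-25-point reading above.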
Taken literally, the AUR coefficients suggest these recoveries may precipitate reductions of 15 to 25 points in reenlistment rates.

As was generally true for tests of all variables in the equation, no exponential, logarithmic or other specification of AUR proved as effective as the untransformed variable. Experimentation with polynomial distributed lag functions was unproductive. No industry or occupation unemployment rates were as effective in explaining reenlistment as aggregate AUR. Using national data, correlations among these various rates of unemployment are in the high 90 percentiles. Only local or regional statistics will show significantly different cyclical phasing for the different market strata to which different occupational groups might be sensitive.

The purpose of the ARAUR variable is to provide the equation with a measure of dynamics. ARAUR is the average quarterly rate of change in AUR calculated over the previous six quarters, the period determined experimentally to be the most effective.* The more rapidly the unemployment rate is changing, the less likely the individual is to perceive or believe what is happening. For a given unemployment rate, the more rapidly the rate has changed to assume its current value, the higher (for AUR falling) or lower (for AUR rising) the reenlistment rate. This is thought to be the reason for the negative ARAUR coefficients.

At least theoretically, an individual's reenlistment decision should be based upon his expectations of future private sector economic conditions. His sense of what he can earn in the private sector in both the near and distant future will be important. Short-term private sector earnings are his "opportunity cost," income he must forgo for Navy training and job experience.

*Experimentation was deemed appropriate because of uncertainty regarding the enlistee's time horizon.

700 L. COHEN & D.E.
REEDY

Long-term prospects in the private sector will characterize the rate of return he calculates for his investment in the Navy. In general, his expectations depend upon the current condition of the economy, plus the trend of developments in the economy, which he will interpret, perhaps simply extrapolate, into the future. Comparing two periods in which the unemployment rate is the same, tentative reenlistees should be relatively optimistic or pessimistic about their prospects in the private sector depending on whether unemployment has been falling or rising. A priori, dynamics were expected to be a significant factor for most, if not all occupations.

Two major difficulties have probably inhibited the effectiveness of ARAUR, which was significant in only three instances. One is that all such variables combine notions of speed with direction. Merging these two aspects of dynamics may be confusing if they solicit altogether different reactions from the tentative reenlistee. The second problem with dynamics is that the importance of information about trends diminishes at very high and very low values of AUR. When unemployment nears its high value in the individual's memory, its probability of falling in the future increases dramatically. Likewise, very low rates of unemployment will be expected to rise simply as an exponential function of the period over which they have persisted. Dynamics are probably most informative when unemployment rates are in their moderate range, when the future is more in doubt.

Perhaps an overriding consideration is that enlistees' information about economic conditions is derived primarily from communications from their points of origin and duty stations. These decidedly local data describe economic conditions and relevant industry and occupational market activities in the community and region in which the tentative reenlistee will consider settling. No one actually obtains employment in the national economy.
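The ARAUR dynamics variable discussed above lends itself to a direct sketch. The endpoint convention below (six quarterly changes ending at the current quarter) is one plausible reading of the text, not the authors' stated formula, and the unemployment series is hypothetical:

```python
# Hypothetical sketch of the ARAUR dynamics variable: the average
# quarterly rate of change in AUR over the previous six quarters.
# The window endpoint convention is an assumption, not taken from
# the study.
def araur(aur, t, window=6):
    """Average quarterly change in aur over the `window` quarters ending at t."""
    changes = [aur[i] - aur[i - 1] for i in range(t - window + 1, t + 1)]
    return sum(changes) / window

# A steadily rising series: AUR climbs 0.5 points per quarter, so
# ARAUR is +0.5. Rapid increases like this lower the predicted
# reenlistment rate through the negative ARAUR coefficients in Table 2.
rates = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
print(araur(rates, 6))  # 0.5
```

Because the changes telescope, this is equivalent to (AUR_t − AUR_{t−6})/6, which may be why a single six-quarter window proved the most effective of the lags the authors tried.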
National economic conditions, to which AUR refers, are in fact often unrelated to the level and dynamics of an individual area economy, a point which must be kept in mind when interpreting the significance of all the unemployment and relative wage variables.

It is interesting to note that the unemployment rate for 20-24 year olds was noticeably less effective than the aggregate rate of unemployment (AUR), which encompasses all age groups. One explanation is that data on the 20-24 year old cohort are characterized by labor market conditions in many of the major metropolitan areas, including those with large low income populations, while the Navy traditionally draws from more rural and less metropolitan areas with proportionately greater numbers of lower-middle and middle income families.

The significance of the unemployment variable, AUR13, should be especially sensitive to discrepancies between composite national and relevant local economic indicators. By the time his first term is nearing expiration, the individual is relatively distant from his home environment and even somewhat isolated from his duty station. At the outset of his term, during the period to which AUR13 refers, the individual has just left his home economy. He is as knowledgeable about that local economy as he will ever be. Ideally, the equations should reference local economic conditions at the time of enlistment, rather than AUR13, which is a national unemployment rate.

As shown in Table 2, AUR13 — unemployment lagged 13 periods prior to reenlistment — is significant in four instances. Most importantly, its coefficients are positive. AUR13 is believed to have bearing on the enlistee's motivation and propensity to reenlist in two respects. Unemployment (economic conditions) about the time of enlistment establishes the climate in the context of which individuals, of significantly different purpose and motivation, decide to enlist.
Alternatively, unemployment about the time of enlistment describes the background against which enlistees make definite, though tentative, career decisions very early during their first terms, decisions which are either consistent or inconsistent with reenlistment three or four years later.

Before experimentation was undertaken with "motivational" unemployment rates such as AUR13, two hypotheses were formulated to anticipate their performance in the equation. From other studies in which the authors were engaged, preliminary evidence had been developed that enlistees are either job or training oriented and will react accordingly in different ways to changes in the economy. The latter group is less inclined to reenlist. Their tendency is to view the Navy as a paid vocational college where they can effect increases in their human capital in preparation for returning to the private sector. Job oriented individuals, who are more concerned with immediate employment, do not see the same private sector alternatives, and are more likely to be attracted to the Navy for its long-term career potential.

Depending upon their orientation, these groups are believed to react to the Navy in opposite directions in response to economic fluctuations. The training oriented group is more prone to enlist when the economy is strong, attracted by advertised private sector positions which require skills and experience readily available from the Navy. More importantly, the composition of enlistees changes when the economy is strong because the job oriented population need not rely as heavily upon the Navy as an alternative employer. Job oriented persons will favor enlistment when the unemployment rate is high, while sophisticated training oriented types should have greater success protecting their employment status and income. The latter are more employable at enlistment and four years later when their first term is completed.
When unemployment is high, the enlisted population is more job oriented and will be characterized by a higher first term reenlistment rate, suggesting a positive, significant coefficient for unemployment rates just prior to enlistment.

Interestingly, experimentation established the clear superiority of AUR13 — six to nine months after enlistment — over any lagged unemployment variable going back to, or prior to, the period of enlistment. This association with unemployment in the third quarter after enlistment is supportive of a second hypothesis. Early into his first term, the typical enlistee is forming opinions about the Navy and is making conscious career decisions regarding training and the degree of his commitment to potentially long-term service. It is at this time that conditions in the private sector (probably in his home town) are taken into account.* If his experiences have been generally negative, he may look more favorably upon the private sector. If the economy is healthy, he may decide not to work to enhance his status in the Navy beyond what he considers necessary to maximize his success upon return to the private sector. Having never fully committed himself to the Navy, the enlistee never seriously considers extending his service through a second term.

In contrast to unemployment variables, to which policymakers can only react, RW (relative wages) is a parameter over which the Navy has direct control. (The relative wage variable was calculated with reference to E-4 base pay, excluding indirect benefits and bonuses, lump sum or installment. RW is the ratio of E-4 base pay to the earnings of private nonagricultural, nonsupervisory production workers.) Relative wages were significant for all but Seaman and Construction personnel, with predictably positive, though small coefficients.
The implication is that relative wages, exclusive of bonuses or benefits, are ineffective as a policy variable to cause independent changes in reenlistment or to combat the effects of an improving economy.

*During their first six months of service, recruits have two weeks' leave, an early opportunity to evaluate their enlistment decisions at home in the company of family and friends, and with an unobstructed view of local private sector labor market conditions.

Even more troublesome regarding the efficacy of relative wages as a policy instrument is that the meaning of the RW variable is suspect. Values for RW trace out a low grade exponential curve which, especially in recent years, parallels quality of life improvements which have been implemented for enlisted personnel. RW is the only variable in the equation which follows this general form and may simply be serving as a proxy for other factors favorable to reenlistment which have not, and probably could not, be captured by the equations.*

IV. EARLY AND RECENT SAMPLES

The dynamics of the national economy changed after 1970. Following almost a decade during which the economy experienced continued improvement, the 1970's brought two sharp recessions. These fluctuations occurred in the context, and perhaps to some extent as a result, of a barrage of new socio-political and technological phenomena and events which grew out of the Viet Nam Conflict and coincided with the maturing of post-World War II baby-boom labor. Even the legendary work ethic which has supposedly sustained the character of the American economy since its inception began to suffer a noticeable loss of popularity.

It can be assumed that, against this background of complex and rapid change, a new business cycle and labor force mentality have emerged from the late 1960's. Attitudes toward the military cannot have been unaffected.
It follows that the relationships between reenlistment and its determinants, especially unemployment and relative wages, may also have been altered and differ now from what they were a decade ago. More than likely, a single model describing the complete twenty year time span will not be appropriate for projecting reenlistment behavior in the near future. Reenlistment over the next five to ten years will probably be more consistent with its history during the very recent past, beginning in the late 1960's and early 1970's.

To test for the existence of unique recent period relationships, the 80 quarter reenlistment time series was split into two 10 year samples, 1/1958-4/1967 and 1/1968-4/1977.† The results of the earlier sample are shown in Table 3. Compared to the equations based on all 80 observations, these early sample equations are obviously less effective. In addition to the seasonal dummy, the equations are dominated by the unemployment rate variables, especially current period AUR. Neither the Draft variable nor RW (relative wages) was significant for a single occupational category. With the exception of the Aviation and Medical categories, the generally low R² statistics indicate that the early sample equations are improperly specified or, more likely, underspecified. DRFT18 and RW might be significant in the context of a more explicit, more complete model.‡ The point remains, however, that the same factors (with the exception of the WAR variable, which was not defined prior to 1968) which explain reenlistment to a reasonable degree of effectiveness over twenty years have failed to repeat that achievement for a shorter time frame.

The results of the later sample are shown in Table 4. Most notable, as compared to the earlier sample, are the R² and constant terms.** The R² statistics are impressive, and the

*ARATE, the average rate of first term eligibles, exhibits a gradual, continual increase over time.
ARATE was introduced to recognize differences in attitudes and earnings among enlistees which would be a function of levels of achievement. In fact, ARATE may be driven by reenlistment rates as a result of changes in promotion policies.
†Analysis of correlation matrices indicated that multicollinearity was not a problem in either of these sample periods.
‡The insignificance of the Draft variable (DRFT18) in the early equations probably derives from the peacetime period for which the variable was relevant. DRFT18 describes motivation at the time of enlistment, and therefore measures peacetime levels of inductees, 1953-1963.
**The WAR variable, absent in the earlier sample, was not responsible for the generally remarkable performance of the recent sample equations. This conclusion was substantiated by tests which estimated equations for the recent sample without the WAR variable present. Changes in results were nominal, with only the Durbin-Watson statistics showing any appreciable deterioration.

SENSITIVITY OF NAVY REENLISTMENT TO UNEMPLOYMENT AND WAGES 703

TABLE 3. Early Ten Year Sample Determinants of Navy Reenlistment (Quarterly: 1/58-4/67)
Coefficients (t-Statistics) For All Variables

Occupation            C       AUR     ARAUR   AUR13   RW      ARATE   WAR     DRFT18  S3      R²     D-W
Deck                +0.58   +2.65   -0.25   +0.56   +0.35   -0.16     —     -0.01   -0.04   .36    1.70
                    (2.00)  (1.94)  (0.89)  (0.71)  (1.30)  (2.13)          (1.02)  (2.79)
Ordnance            +0.59   -1.18   -0.09   +1.45   -0.18   -0.06           -0.01   -0.05   .48    2.13
                    (1.86)  (0.80)  (0.28)  (1.68)  (0.61)  (0.75)          (1.53)  (3.26)
Electronics         -0.56   -1.10   -0.01   +3.01   +0.42   +0.15           -0.01   -0.07   .52    1.81
 & Prec. Equip.     (1.29)  (0.54)  (0.02)  (2.52)  (1.03)  (1.29)          (1.15)  (3.01)
Administration      +0.54   +5.37   -0.71   -0.24   +0.33   -0.16           -0.00   -0.06   .59    2.17
                    (1.81)  (3.80)  (2.45)  (0.30)  (1.17)  (2.03)          (0.64)  (3.84)
Seaman              +0.20   +4.26   -0.32   -0.28   +0.19   -0.08           -0.00   -0.02   .57    1.97
                    (0.76)  (3.43)  (1.26)  (0.38)  (0.78)  (1.20)          (0.47)  (1.63)
Engineering & Hull  +0.25   +2.21   -0.13   +0.18   +0.19   -0.06           -0.00   -0.03   .36    2.17
                    (1.08)  (2.06)  (0.58)  (0.29)  (0.87)  (0.93)          (0.85)  (2.90)
Construction        -0.25   +3.31   -0.50   +1.17   -0.20   +0.07           +0.00   -0.04   .51    1.52
                    (0.82)  (2.34)  (1.72)  (1.41)  (0.70)  (0.92)          (0.60)  (2.61)
Aviation            +0.08   +5.27   -0.14   -0.24   -0.02   -0.03           -0.00   -0.04   .74    2.22
                    (0.34)  (4.53)  (0.57)  (0.35)  (0.08)  (0.45)          (1.07)  (3.11)
Medical & Dental    +0.22   +7.69   -0.84   -1.48   -0.27   -0.05           -0.00   -0.06   .77    1.91
                    (0.70)  (5.16)  (2.75)  (1.71)  (0.90)  (0.55)          (0.51)  (4.01)

Significance: For 30 or more degrees of freedom — 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.

704 L. COHEN & D.E. REEDY

TABLE 4. Recent Ten Year Sample Determinants of Navy Reenlistment (Quarterly: 1/68-4/77)
Coefficients (t-Statistics) For All Variables

Occupation            C       AUR     ARAUR   AUR13   RW      ARATE   WAR     DRFT18  S3      R²     D-W
Deck                -1.12   +5.25   -1.09   +8.64   +0.02   +0.16   +0.07   -0.00   -0.03   .88    1.98
                    (1.88)  (4.56)  (3.49)  (5.08)  (0.08)  (1.13)  (1.85)  (1.59)  (1.75)
Ordnance            -2.58   +8.26   -1.44   +6.30   -0.05   +0.53   +0.11   -0.01   -0.05   .78    1.42
                    (2.65)  (4.37)  (2.82)  (2.26)  (0.14)  (2.32)  (1.70)  (1.39)  (1.81)
Electronics         -1.42   +7.16   +0.61   +3.82   +0.61   +0.28   -0.01   -0.01   -0.02   .73    1.01
 & Prec. Equip.     (0.86)  (2.09)  (0.71)  (0.82)  (1.01)  (0.73)  (0.10)  (0.50)  (0.39)
Administration      -1.20   +2.80   -0.53   +6.96   +0.01   +0.24   -0.02   -0.00   -0.03   .89    2.03
                    (2.35)  (2.82)  (2.00)  (4.77)  (0.08)  (1.98)  (0.52)  (1.30)  (2.08)
Seaman              -1.38   -0.37   +0.27   +6.41   +0.41   +0.23   +0.03   -0.00   -0.00   .81    1.30
                    (6.41)  (0.41)  (1.11)  (4.80)  (2.41)  (2.11)  (1.03)  (1.57)  (0.33)
Engineering & Hull  -1.56   +1.23   -0.24   +8.03   +0.33   +0.27   -0.01   -0.07   -0.02   .89    1.41
                    (2.86)  (1.10)  (0.78)  (4.88)  (1.57)  (2.08)  (0.22)  (1.93)  (1.31)
Construction        -1.68   -5.12   +0.71   +6.44   +0.41   +0.40   -0.22   +0.01   -0.00   .89    1.72
                    (2.45)  (3.85)  (1.98)  (3.29)  (1.62)  (2.51)  (4.90)  (1.51)  (1.51)
Aviation            -1.34   +1.29   -0.37   +8.26   +0.33   +0.22   +0.02   -0.01   -0.02   .91    2.12
                    (2.75)  (1.90)  (1.46)  (5.92)  (1.85)  (1.95)  (0.65)  (2.56)  (1.44)
Medical & Dental    -1.68   -1.22   +0.28   +8.22   +0.80   +0.22   -0.08   -0.01   -0.01   .79    1.54
                    (2.76)  (1.04)  (0.87)  (4.74)  (3.56)  (1.90)  (1.97)  (2.18)  (0.33)

Significance: For 30 or more degrees of freedom — 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.

constants generally suggestive of equations which describe different relationships than those captured by the early and total sample equations. Certainly, the single most important reason for the superior performance of the recent sample equations is the overall strength of the unemployment indicators, AUR, ARAUR and AUR13.* Assuming that attitudes toward military service have been altered, it is likely that they have changed to favor a heightened sensitivity to economic fluctuations. More and more recruits may be viewing military service strictly as a vocational decision. If they are training oriented, they watch the economy to discern when they can most effectively capitalize on the increases in their human capital which the Navy is providing them. If they are job oriented and see the Navy as a fall-back alternative to private sector unemployment, they may remain in the Navy only until they detect better opportunities on the outside.
Either way, job or training oriented, the enlistee will do his best to keep in touch with economic conditions, more now than in the 1960's when economic motivations were less influential relative to social and personal psychological factors.

*The negative coefficient for AUR in the Construction equation (Table 4) may indicate that Navy Construction personnel identify with the public works (infrastructure) component of the construction industry. Public works construction is sometimes undertaken as part of counter-cyclical employment programs, and must precede or follow residential and commercial/industrial development. As such, public works employment may experience cycles approaching 180 degrees out of phase with other construction activities which are more closely associated with movements in the national unemployment rate. This assumption about Navy construction personnel is not necessarily inconsistent with the typically positive coefficient for AUR13. Six to nine months into his first term, the new recruit may not yet consider himself affiliated with the construction industry.

RW is significant and positive for only three occupational categories, and of minor impact in these instances. Redefining relative wages based upon Regular Military Compensation did nothing to increase the number of significant cases, nor did it affect significant RW coefficients substantially. Taking into account housing and subsistence allowances and the associated tax advantages produced results which were no more impressive.

V. FINDINGS AND IMPLICATIONS

For the analysis, conceptualization and timing of programs designed to anticipate or control reenlistment, of the three models (early or recent samples, or all 80 observations) the recent sample equations would seem to be most pertinent for two reasons. First, economic conditions in the near future are more likely to resemble the early 1970's than the 1960's.
Those social, political and economic phenomena which have precipitated the new dynamics of recent fluctuations are likely to persist. Second, contemporary attitudes toward military service are part of a generally irreversible evolution of mores and traditions. Intuitively, changes in attitudes should involve an increasingly heavy emphasis by recruits upon the vocational aspects of service. Sensitivities to, and knowledge of, economic conditions should be increasing, perhaps as indicated by the performance of unemployment rate variables, particularly AUR13, in the recent sample equations. Quite probably, it is these relationships which are responsible for the effectiveness of equations based on the entire twenty year time series.

Assuming that neither war nor the Draft are likely to repeat themselves in the near future, the only recurrent determinants of reenlistment — other than subjective factors not captured by the equations — are unemployment (economic conditions) and relative wages. Relative wages, now that parity between military and private sector compensation is guaranteed by law, are effectively constant, excluding the payment of bonuses. That leaves the economy, and perhaps bonuses also, as the dominant influences which will characterize reenlistment behavior in the near future. (Time series data on which the study was based precluded incorporating lump sum or installment bonuses into either the RW variable or elsewhere in the equations.)

As Table 5 indicates, with reference only to unemployment rate variables, it is possible to explain from 51 to 86 percent of the variation in reenlistment rates over the past ten years.* The significance of the unemployment variables is particularly impressive considering:

1. The equations include no quality of life indicators representative of improvements which have occurred to enlisted working conditions.

2. No sampling has occurred to separate individuals by sex, ethnicity, family status or mental group.

3.
The unemployment rate data being used are national and aggregate. No direct references have been made to local economic conditions in the enlistee's home area or duty station.

Without question, reenlistment rates are highly sensitive to economic conditions at reenlistment and enlistment, represented by unemployment rates as indicators of the availability and difficulty of securing private sector employment. The effect of current unemployment is quite strong, but the effect of unemployment about the time of enlistment, AUR13, is especially pronounced.

*Some R² statistics are undoubtedly biased upwards due to the presence of positive serial correlation.

706 L. COHEN & D.E. REEDY

TABLE 5. Recent Ten Year Sample Determinants of Navy Reenlistment (Quarterly: 1/68-4/77)
Coefficients (t-Statistics) for Significant Variables

Occupation            C       AUR     ARAUR   AUR13    R²     D-W
Deck                -0.30   +4.77   -0.97   +6.28    .83    1.54
                    (6.80)  (7.41)  (4.80)  (6.07)
Ordnance                    +7.89   -1.23            .68    1.19
                            (7.27)  (3.61)
Electronics                +10.44                    .69    0.83
 & Prec. Equip.            (6.39)
Administration      -0.26   +4.02   -0.78   +6.69    .85    1.92
                    (6.92)  (7.14)  (4.41)  (7.41)
Seaman              -0.22   +1.79           +4.77    .67    1.12
                    (5.85)  (3.18)          (5.27)
Engineering & Hull  -0.30   +3.88   -0.78   +7.83    .82    1.36
                    (6.70)  (5.84)  (3.72)  (7.34)
Construction        -0.33                  +10.27    .51    0.43
                    (3.70)                  (4.82)
Aviation            -0.32   +3.95   -0.81   +7.40    .86    1.70
                    (8.16)  (6.84)  (4.48)  (7.98)
Medical & Dental    -0.15   +2.13           +5.46    .56    1.03
                    (2.75)  (2.63)          (4.19)

Note: Coefficients and t-statistics are omitted for variables not significant at the 90% level. For 30 or more degrees of freedom — 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.

The effects of AUR and AUR13 combined can be devastating for reenlistment. Compare two "classes" of recruits, one enlisting and coming up for reenlistment during peaks in the economy, the other during low points.
Assuming these peaks and troughs are separated by only two percent, six to eight percent unemployment for example, the total difference in reenlistment rates between the two groups could be as high as 27 to 29 percent for Deck and Ordnance, and as low as three percent for Construction.

Striking by comparison is the poor performance of relative wages, the variable RW. Statistically significant for only three occupations, its effect in those cases is nominal, especially considering the magnitude of the unemployment rate coefficients. Calculating the rates of substitution between unemployment and relative wages, the latter appears ineffective as a means of protecting reenlistment rates from a healthy, vigorous private sector.

Comparison of equations suggests that the Navy's enlisted workforce is not a homogeneous mass for which precisely the same reenlistment programs would be appropriate. Differences in reactions to the economy and relative wages among the nine major occupational groups which were studied are evident and of significant order of magnitude. The policy implication is that different occupations should be treated differently to effect comparable degrees of control over their reenlistment rates.

Unemployment rates, current and at the time of enlistment, are the principal determinants of reenlistment. The effects of changes in current economic conditions (AUR) are substantial, in the neighborhood of 4 to 5 percentage points change in reenlistment rates for every 1 point change in unemployment. Moreover, the significance and power of the AUR13 variable is particularly important for the nature and timing of programs designed to increase reenlistment. It is apparent that the propensity to reenlist is to a great extent determined very early in the first term, at or about the time of enlistment.
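The two-"class" comparison above can be reproduced arithmetically from the recent-sample (Table 4) coefficients, assuming the two-point unemployment gap applies to both the reenlistment-time (AUR) and enlistment-time (AUR13) terms:

```python
# Reconstruction of the two-"class" back-of-envelope comparison:
# a two-point unemployment gap applied to both the AUR and AUR13
# coefficients from Table 4. The linear application of the
# coefficients is an illustrative assumption.
def reenlistment_gap(b_aur, b_aur13, gap_pts=2.0):
    """Predicted reenlistment-rate difference (percentage points)
    between a peak-economy class and a trough-economy class."""
    return gap_pts * (b_aur + b_aur13)

print(round(reenlistment_gap(5.25, 8.64), 1))   # Deck: 27.8
print(round(reenlistment_gap(8.26, 6.30), 1))   # Ordnance: 29.1
print(round(reenlistment_gap(-5.12, 6.44), 1))  # Construction: 2.6
```

These values line up with the 27 to 29 percent cited for Deck and Ordnance and roughly three percent for Construction; the small Construction gap reflects its negative AUR coefficient largely offsetting its positive AUR13 coefficient.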
The importance of AUR13 has profound implications for the incidence of recruiting expenditures and early enlistment counseling programs. Finally, although for some occupations they may be a significant determinant of reenlistment, military relative to private sector rates of compensation are generally powerless to compensate for the effects of economic fluctuations.

To the extent that bonuses paid in installments over long periods are viewed as regular wages, they may be ineffective as reenlistment incentives. Other than their importance for enlistment, relative wages can probably be allowed to deteriorate when economic conditions favor rising reenlistment rates.* More importantly, the order of magnitude of increases in military compensation — excluding bonuses — necessary to protect reenlistment rates when economic conditions are driving those rates down is almost assuredly financially and politically unacceptable. No evidence was discovered indicating that reenlistment has been affected by variations in benefits for housing, subsistence, or tax advantages.

If relative wages are not effective as a program instrument to control reenlistment, the Navy has only two alternatives: lump-sum incentive grants, which are more impressive, at least more visible, than wage increases (or installment bonuses) and probably less expensive for a given impact; and procedural changes affecting the pace of promotions and intra-Navy job and occupational mobility so as to bring the career patterns of enlisted personnel more in line with those of their private sector counterparts. The latter may be especially important in light of the orientation toward employment opportunities and lack of concern for direct and indirect monetary compensation which this study has identified. Differences between Navy and private sector rates of achievement and mobility — vertical and horizontal — may produce a degree of discontentment and sense of falling behind the private sector.
This problem has potentially serious ramifications for first and even second term reenlistment, and is clearly a policy issue deserving immediate study.

Unemployment rates are factors to which the Navy can react or for which it can prepare, but not control. Since relative wages are now basically constant, and if they are generally powerless to counteract economic stimuli, substantial lump sum bonus programs and improvements in career development paths may be necessary to counteract or nullify the link between business cycles and reenlistment.

VI. RESERVATIONS AND LIMITATIONS

There is the standard set of reservations associated with regression analysis and all methods of statistical inference involving correlation. Specifically with respect to the equations discussed above, there are a few problems. The signs of some coefficients are troublesome. The Durbin-Watson statistics are generally indicative of serial correlation among error terms, more severe for some occupations than others. As noted above, the RW variable is suspect. Its mildly exponential behavior is coincidental with a number of other phenomena which might have also affected reenlistment. Despite these difficulties, the equations do well and are reliable for what they convey about the direction and order of magnitude of the effects of unemployment upon reenlistment.

*There is evidence that the major proportion of tentative enlistees are either oblivious or insensitive to rates of military compensation. At most, those who do not enlist may be somewhat put off by a general impression that military pay is relatively low compared to the private sector. See, for example, D. Grissmer, et al., "An Econometric Analysis of Volunteer Enlistments by Service and Cost Effectiveness Comparison of Service Incentive Programs," OAD-CR-66, General Research Corporation, October 1974.
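The Durbin-Watson diagnostic invoked above, and reported in every table, can be sketched directly; the residual series below are hypothetical, chosen only to illustrate the statistic's range:

```python
# Minimal sketch of the Durbin-Watson statistic reported in the tables:
# d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2 over regression residuals e_t.
# Values near 2 suggest little first-order serial correlation; values
# well below 2, as in several equations here, suggest positive serial
# correlation, which biases the reported R-squared upward.
def durbin_watson(residuals):
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 3.0: toward 4, negative correlation
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # 0.0: strong positive correlation
```

The second series never changes sign, so successive differences vanish and d collapses toward zero, which is the pattern behind the low statistics (e.g. 0.43 for Construction in Table 5).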
The principal limitations to direct policy application of these findings derive from the simplicity of the time series data on which they are based. Without specific ratings data, reenlistment bonuses could not be taken into account. Available time series data prohibit effecting any controls for personal or socio-demographic considerations, notably ethnicity, sex, family status and especially mental group. The time series analysis above fails to pinpoint who is leaving and to explain precisely why they leave. Equally important, the data have prevented any reference to labor market conditions in either the enlistee's duty station or home economy where he might consider settling. Local economies differ substantially in the way they experience phases of any national cycle, differences which a complete analysis of reenlistment behavior must take into account. To a great extent, compensation for these disadvantages of time series analysis could be accomplished via supplementary cross-sectional analysis of area reenlistment rates using the Enlisted Master File or some related set of records.

ACKNOWLEDGMENTS

The authors gratefully acknowledge the technical assistance of Ms. Deborah Coffin, without whom this paper would have suffered considerable loss of substance and detail, and the valuable criticisms and comments of Drs. Alfred Rhode and John Martin of Information Spectrum, Inc., Mr. Irwin Schiff and LCDR Kevin Delaney of OP-964D, and Mr. Samuel Kleinman of the Center for Naval Analyses.

BIBLIOGRAPHY

[1] Albrecht, M., "A Discussion of Some Applications of Human Capital Theory to Military Manpower Issues," P-5727, RAND, September 1976.
[2] Bryan, J. and A. Singer, "Prediction of Reenlistment Using Regression Estimation of Event Probabilities," Research Contribution No. 13, Center for Naval Analyses, October 1965.
[3] Cooper, R., "The All-Volunteer Force: Five Years Later," P-6051, RAND, December 1977.
[4] Enns, J., "Effect of the Variable Reenlistment Bonus on Reenlistment Rates: Empirical Results for FY-71," R-1502-ARPA, RAND, June 1975.
[5] Grissmer, D., et al., "An Econometric Analysis of Volunteer Enlistments by Service and Cost Effectiveness Comparison of Service Incentive Programs," OAD-CR-66, General Research Corporation, October 1974.
[6] Haber, S. and C. Stewart, Jr., "The Responsiveness of Reenlistment to Changes in Navy Compensation," TR-1254, George Washington University, May 1975.
[7] Lindsay, W., Jr. and B. Causey, "A Statistical Model for the Prediction of Reenlistment," TP-342, Research Analysis Corporation, March 1969.
[8] Lindsay, W., et al., "Simple Regression Models for Estimating Future Enlistment and Reenlistment in Army Manpower Planning," TP-402, Research Analysis Corporation, September 1970.
[9] Lockman, R., et al., "Motivational Factors in Accession and Retention Behavior," Research Contribution 201, Center for Naval Analyses, January 1972.

SENSITIVITY OF NAVY REENLISTMENT TO UNEMPLOYMENT AND WAGES 709

[10] Massell, A., "An Imputation Method for Estimating Civilian Opportunities Available to Military Enlisted Men," R-1565-ARPA, RAND, July 1975.
[11] Massell, A., "Reservation Wages and Military Reenlistments," P-55336, RAND, February 1976.
[12] Nelson, G., "An Economic Analysis of First Term Reenlistments in the Army," P-647, Institute for Defense Analyses, June 1970.
[13] Quigley, J. and R. Wilburn, "An Economic Analysis of First Term Reenlistment in the Air Force," AFPDPL-PR-69-017, Personnel Research and Analysis Division, Directorate of Personnel Planning, USAF, September 1969.
[14] Young & Rubicam, Inc., "Naval Retention: A Problem of Empathy," May 1970.

INDEX TO VOLUME 26

ALAM, K., "Distribution of Sample Correlation Coefficients," Vol. 26, No. 2, June 1979, pp. 327-330.
AL-AYAT, R. and R. Fare, "On the Existence of Joint Production Functions," Vol. 26, No. 4, Dec. 1979, pp. 627-630.
ARMSTRONG, R.D. and E.L. Frome, "Least-Absolute-Value Estimators for One-Way and Two-Way Tables," Vol. 26, No. 1, Mar. 1979, pp. 79-96.
BARLOW, R.E., "Geometry of the Total Time on Test Transform," Vol. 26, No. 3, Sept. 1979, pp. 393-402.
BARZILY, Z., W.H. Marlow and S. Zacks, "Survey of Approaches to Readiness," Vol. 26, No. 1, Mar. 1979, pp. 21-31.
BAZARAA, M.S. and A.N. Elshafei, "An Exact Branch-and-Bound Procedure for the Quadratic-Assignment Problem," Vol. 26, No. 1, Mar. 1979, pp. 109-121.
BERG, M. and B. Epstein, "A Note on a Modified Block Replacement Policy for Units with Increasing Marginal Running Cost," Vol. 26, No. 1, Mar. 1979, pp. 157-160.
BHAT, U.N., M. Shalaby and M.J. Fischer, "Approximation Techniques in the Solution of Queueing Problems," Vol. 26, No. 2, June 1979, pp. 311-326.
BITRAN, G.R., "Experiments With Linear Fractional Problems," Vol. 26, No. 4, Dec. 1979, pp. 689-693.
BULFIN, R.L., R.G. Parker and C.M. Shetty, "Computational Results with a Branch-and-Bound Algorithm for the General Knapsack Problem," Vol. 26, No. 1, Mar. 1979, pp. 41-46.
BUTLER, D.A., "A Complete Importance Ranking for Components of Binary Coherent Systems, With Extensions to Multi-State Systems," Vol. 26, No. 4, Dec. 1979, pp. 565-578.
CHANDRA, R., "On n/1/F Dynamic Deterministic Problems," Vol. 26, No. 3, Sept. 1979, pp. 537-544.
CHARNETSKI, J.R. and R.M. Soland, "Multiple-Attribute Decision Making With Partial Information: The Expected-Value Criterion," Vol. 26, No. 2, June 1979, pp. 249-256.
CHAUDHRY, M.L., "The Queueing System M^X/G/1 and its Ramifications," Vol. 26, No. 4, Dec. 1979, pp. 667-674.
COHEN, L. and D.E. Reedy, "The Sensitivity of First Term Navy Reenlistment to Changes in Unemployment and Relative Wages," Vol. 26, No. 4, Dec. 1979, pp. 695-709.
COOPER, L. and J. Kennington, "Nonextreme Point Solution Strategies For Linear Programs," Vol. 26, No. 3, Sept. 1979, pp. 447-461.
CRAVEN, B.D. and B. Mond, "A Note on Duality in Homogeneous Fractional Programming," Vol. 26, No. 1, Mar. 1979, pp. 153-155.
CURRAN, R.T., S.C. Jaquette and J.L. Politzer, "Damage Calculations for Unreliable Warheads," Vol. 26, No. 3, Sept. 1979, pp. 545-550.
DEGROOT, M.H., "Bayesian Estimation and Optimal Designs in Partially Accelerated Life Testing," Vol. 26, No. 2, June 1979, pp. 223-235.
DERMAN, C., G.J. Lieberman and S.M. Ross, "Adaptive Disposal Models," Vol. 26, No. 1, Mar. 1979, pp. 33-40.
ELMAGHRABY, S.E. and P.S. Pulat, "Optimal Project Compression With Due-Dated Events," Vol. 26, No. 2, June 1979, pp. 331-348.
FISK, J.C. and M.S. Hung, "A Heuristic Routine for Solving Large Loading Problems," Vol. 26, No. 4, Dec. 1979, pp. 643-650.
FISK, J. and P. McKeown, "The Pure Fixed Charge Transportation Problem," Vol. 26, No. 4, Dec. 1979, pp. 631-641.
GITTINS, J.C. and D.M. Roberts, "The Search for an Intelligent Evader Concealed in One of an Arbitrary Number of Regions," Vol. 26, No. 4, Dec. 1979, pp. 651-666.
GOLDEN, B.L. and F.B. Alt, "Interval Estimation of a Global Optimum for Large Combinatorial Problems," Vol. 26, No. 1, Mar. 1979, pp. 69-77.
GRAVES, S.C. and J. Keilson, "A Methodology for Studying the Dynamics of Extended Logistic Systems," Vol. 26, No. 2, June 1979, pp. 169-197.
GUPTA, R.K., V. Srinivasan and P.L. Yu, "Optimal State-Dependent Pricing Policies for a Class of Stochastic Multiunit Service Systems," Vol. 26, No. 2, June 1979, pp. 257-283.
HELGASON, R.V. and J.L. Kennington, "A New Storage Reduction Technique for the Solution of the Group Problem," Vol. 26, No. 4, Dec. 1979, pp. 681-687.
HODGSON, T.J. and G.J. Koehler, "Computation Techniques for Large Scale Undiscounted Markov Decision Processes," Vol. 26, No. 4, Dec. 1979, pp. 587-594.
ISERMANN, H., "The Enumeration of all Efficient Solutions for a Linear Multiple-Objective Transportation Problem," Vol. 26, No. 1, Mar. 1979, pp. 123-139.
JEFFERSON, T.R., G.M. Folie and C.H. Scott, "Duality for Quasi-Concave Programs With Application to Economics," Vol. 26, No. 4, Dec. 1979, pp. 611-625.
JOSHI, P.C., "On the Moments of Gamma Order Statistics," Vol. 26, No. 4, Dec. 1979, pp. 675-679.
KARMARKAR, U.S., "Convex/Stochastic Programming and Multilocation Inventory Problems," Vol. 26, No. 1, Mar. 1979, pp. 1-19.
LEAVENWORTH, R.S. and R.L. Scheaffer, "Design of a Process Control Scheme for Defects Per 100 Units Based on AOQL," Vol. 26, No. 3, Sept. 1979, pp. 463-485.
LEWIS, P.A.W. and G.S. Shedler, "Simulation of Nonhomogeneous Poisson Processes by Thinning," Vol. 26, No. 3, Sept. 1979, pp. 403-413.
LUSS, H., "A Capacity-Expansion Model for Two Facility Types," Vol. 26, No. 2, June 1979, pp. 291-303.
MISRA, R.B., "A Note on Optimal Inventory Management Under Inflation," Vol. 26, No. 1, Mar. 1979, pp. 161-165.
MITCHELL, C.R. and A.S. Paulson, "M/M/1 Queues with Interdependent Arrival and Service Processes," Vol. 26, No. 1, Mar. 1979, pp. 47-56.
MUCKSTADT, J.A., "A Three-Echelon, Multi-Item Model for Recoverable Items," Vol. 26, No. 2, June 1979, pp. 199-221.
MURPHY, F.H. and A.L. Soyster, "Multiproduct Lot-Size Scheduling with Proportional Product Demands," Vol. 26, No. 1, Mar. 1979, pp. 97-108.
NEMHAUSER, G.L. and G.M. Weber, "Optimal Set Partitioning, Matchings and Lagrangian Duality," Vol. 26, No. 4, Dec. 1979, pp. 553-563.
PEGDEN, C.D. and C.C. Petersen, "An Algorithm (GIPC2) for Solving Integer Programming Problems With Separable Nonlinear Objective Functions," Vol. 26, No. 4, Dec. 1979, pp. 595-609.
PINEDO, M. and G. Weiss, "Scheduling of Stochastic Tasks on Two Parallel Processors," Vol. 26, No. 3, Sept. 1979, pp. 527-535.
RAMANI, K.V., "Some Bayes Tests and their Asymptotic Properties for the Multivariate, Multisample Goodness-of-Fit Problem," Vol. 26, No. 2, June 1979, pp. 237-247.
ROSENBERG, D., "A New Analysis of a Lot-Size Model With Partial Backlogging," Vol. 26, No. 2, June 1979, pp. 349-353.
ROSS, S.M. and J. Schechtman, "On the First Time a Separately Maintained Parallel System has been Down for a Fixed Time," Vol. 26, No. 2, June 1979, pp. 285-290.
SHANTHIKUMAR, J.G., "On a Single-Server Queue With State-Dependent Service," Vol. 26, No. 2, June 1979, pp. 305-309.
SHIMSHAK, D.G., "A Comparison of Waiting Time Approximations in Series Queueing Systems," Vol. 26, No. 3, Sept. 1979, pp. 499-509.
SHOGAN, A.W., "A Single Server Queue with Arrival Rate Dependent on Server Breakdowns," Vol. 26, No. 3, Sept. 1979, pp. 487-497.
SIEGMUND, D., "Confidence Intervals Related to Sequential Test for the Exponential Distribution," Vol. 26, No. 1, Mar. 1979, pp. 57-67.
SILVER, E.A., "Coordinated Replenishments of Items Under Time-Varying Demand: Dynamic Programming Formulation," Vol. 26, No. 1, Mar. 1979, pp. 141-151.
SUBELMAN, E.J., "Optimal Betting Strategies for Favorable Games," Vol. 26, No. 2, June 1979, pp. 355-363.
TAMIR, A., "Scheduling Jobs to Two Machines Subject to Batch Arrival Ordering," Vol. 26, No. 3, Sept. 1979, pp. 521-525.
TAYLOR, J.G., "Some Simple Victory-Prediction Conditions for Lanchester-Type Combat Between Two Homogeneous Forces With Supporting Fire," Vol. 26, No. 2, June 1979, pp. 365-375.
THIAGARAJAN, T.R. and C.M. Harris, "Statistical Tests for Exponential Services from M/G/1 Waiting-Time Data," Vol. 26, No. 3, Sept. 1979, pp. 511-520.
WAGNER, H.M., "The Next Decade of Logistics Research," Vol. 26, No. 3, Sept. 1979, pp. 377-392.
WEISS, L., "The Asymptotic Distribution of Order Statistics," Vol. 26, No. 3, Sept. 1979, pp. 437-445.
WHITE, C.C., III, "Bounds on Optimal Cost for a Replacement Problem with Partial Observations," Vol. 26, No. 3, Sept. 1979, pp. 415-422.
ZACKS, S., "Survival Distributions in Crossing Fields Containing Clusters of Mines with Possible Detection and Uncertain Activation or Kill," Vol. 26, No. 3, Sept. 1979, pp. 423-435.
ZUCKERMAN, D., "A Diffusion Model for the Control of a Multipurpose Reservoir System," Vol. 26, No. 4, Dec. 1979, pp. 579-586.

U.S. GOVERNMENT PRINTING OFFICE: 1979 O—305-622

INFORMATION FOR CONTRIBUTORS

The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217. Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one or more referees.

Manuscripts submitted for publication should be typewritten, double-spaced, and the author should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted with the original.

A short abstract (not over 400 words) should accompany each manuscript. This will appear at the head of the published paper in the QUARTERLY.

There is no authorization for compensation to authors for papers which have been accepted for publication. Authors will receive 250 reprints of their published papers.

Readers are invited to submit to the Managing Editor items of general interest in the field of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections of the QUARTERLY.

NAVAL RESEARCH LOGISTICS QUARTERLY
DECEMBER 1979, VOL. 26, NO. 4
NAVSO P-1278

CONTENTS

ARTICLES                                                              Page

Optimal Set Partitioning, Matchings and Lagrangian Duality
    G. L. NEMHAUSER and G. M. WEBER ................................. 553
A Complete Importance Ranking for Components of Binary Coherent
Systems, With Extensions to Multi-State Systems
    D. A. BUTLER .................................................... 565
A Diffusion Model for the Control of a Multipurpose Reservoir System
    D. ZUCKERMAN .................................................... 579
Computation Techniques for Large Scale Undiscounted Markov
Decision Processes
    T. J. HODGSON and G. J. KOEHLER ................................. 587
An Algorithm (GIPC2) for Solving Integer Programming Problems
With Separable Nonlinear Objective Functions
    C. D. PEGDEN and C. C. PETERSEN ................................. 595
Duality for Quasi-Concave Programs With Application to Economics
    T. R. JEFFERSON, G. M. FOLIE and C. H. SCOTT .................... 611
On the Existence of Joint Production Functions
    R. AL-AYAT and R. FARE .......................................... 627
The Pure Fixed Charge Transportation Problem
    J. FISK and P. MCKEOWN .......................................... 631
A Heuristic Routine for Solving Large Loading Problems
    J. C. FISK and M. S. HUNG ....................................... 643
The Search for an Intelligent Evader Concealed in One of an
Arbitrary Number of Regions
    J. C. GITTINS and D. M. ROBERTS ................................. 651
The Queueing System M^X/G/1 and its Ramifications
    M. L. CHAUDHRY .................................................. 667
On the Moments of Gamma Order Statistics
    P. C. JOSHI ..................................................... 675
A New Storage Reduction Technique for the Solution of the Group Problem
    R. V. HELGASON and J. L. KENNINGTON ............................. 681
Experiments With Linear Fractional Problems
    G. R. BITRAN .................................................... 689
The Sensitivity of First Term Navy Reenlistment to Changes in
Unemployment and Relative Wages
    L. COHEN and D. E. REEDY ........................................ 695
Index ............................................................... 711

OFFICE OF NAVAL RESEARCH
Arlington, Va. 22217